Generation of acyl amino acids

ABSTRACT

In certain embodiments, the present invention comprises compositions and methods useful in the generation of acyl amino acids. In certain embodiments, the present invention provides an engineered polypeptide comprising a peptide synthetase domain; in some such embodiments, the engineered polypeptide comprises only a single peptide synthetase domain. In some embodiments, the present invention provides an engineered peptide synthetase that is substantially free of a thioesterase domain and/or a reductase domain. In certain embodiments, the present invention provides an acyl amino acid composition comprising a plurality of different forms of an acyl amino acid. In some such compositions, substantially all of the acyl amino acids within the composition contain the same amino acid moiety and differ with respect to acyl moiety. Also described are populations in which the fatty acid is, for example, 95% of one length (C14, myristic).

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a division of U.S. application Ser. No. 14/776,805, filed Sep. 15, 2015, which is a U.S. national stage application under 35 U.S.C. § 371 of International Patent Application No. PCT/US2014/029150, filed Mar. 14, 2014, which claims the benefit of U.S. Provisional Application No. 61/788,346, filed Mar. 15, 2013, the contents of each of which are hereby incorporated herein in their entirety.

SEQUENCE LISTING

The present specification makes reference to a Sequence Listing (submitted electronically as a .txt file named “2003320_0162_Sequence_listing.txt” on May 14, 2018). The .txt file was generated on May 13, 2018, and is 199 kilobytes in size. The entire contents of the Sequence Listing are hereby incorporated by reference.

BACKGROUND

Acyl amino acids are commercially important compounds. Many have advantageous characteristics and are sold as surfactants, antibiotics, anti-insect agents, and a variety of other important agents.

Traditionally, acyl amino acids have been manufactured chemically. Such chemical manufacturing methods are hampered by a variety of shortcomings relating to the ease of obtaining and storing the starting materials, the necessity of using harsh and sometimes dangerous chemical reagents in the manufacturing process, the difficulty and efficiency of the synthesis itself, the fiscal and environmental cost of disposing of chemical by-products, etc. Thus, new compositions and methods for the efficient and cost-effective synthesis of acyl amino acids and manufacture on a commercial scale would be beneficial.

Recently, important technologies have been developed that permit production of acyl amino acids by engineered peptide synthetase polypeptides (see U.S. Pat. No. 7,981,685, issued Jul. 19, 2011 and incorporated herein by reference in its entirety). Improvements and/or supplements to such technologies would be desirable and beneficial.

SUMMARY OF THE INVENTION

In certain embodiments, the present invention comprises compositions and methods useful in the generation of acyl amino acids. In certain embodiments, the present invention provides an engineered polypeptide comprising a peptide synthetase domain; in some such embodiments, the engineered polypeptide comprises only a single peptide synthetase domain. In some embodiments, the present invention provides an engineered peptide synthetase that is substantially free of a thioesterase domain and/or a reductase domain.

In certain embodiments, the present invention provides an acyl amino acid composition comprising a plurality of different forms of an acyl amino acid. In some such compositions, substantially all of the acyl amino acids within the composition contain the same amino acid moiety and differ with respect to acyl moiety. Also described are populations in which the fatty acid is, for example, 95% of one length (C14, myristic).

In some embodiments, the present invention provides a method of making an acyl amino acid composition by contacting an engineered peptide synthetase with an amino acid substrate and an acyl entity substrate for the engineered peptide synthetase, under conditions and for a time sufficient for an acyl amino acid composition to be made. In some embodiments, the method comprises providing a cell engineered to express the engineered peptide synthetase. In some embodiments, the engineered peptide synthetase does not include a thioesterase domain; in some embodiments, the engineered peptide synthetase does not include a reductase domain; in some embodiments, the engineered peptide synthetase includes neither a thioesterase domain nor a reductase domain.

In some embodiments, an amino acid substrate is or comprises an amino acid as set forth herein.

In some embodiments, an acyl entity substrate is or comprises a fatty acid moiety. In some embodiments, an acyl entity substrate is or comprises a fatty acid.

The present invention provides cells engineered to express at least one engineered peptide synthetase that synthesizes an acyl amino acid.

In some embodiments, the present invention comprises an acyl amino acid composition produced by an engineered peptide synthetase.

The present invention provides methods of preparing a product comprising: providing or obtaining an acyl amino acid composition prepared in an engineered host (e.g., microbial) cell; optionally enriching the acyl amino acid composition for a particular acyl amino acid; and, in some embodiments, combining the enriched acyl amino acid composition with at least one other component to produce a product.

In some embodiments, the invention provides a method comprising steps of: contacting an engineered peptide synthetase polypeptide that comprises a single peptide synthetase domain and lacks a thioesterase domain and/or a reductase domain with (i) an amino acid substrate of the peptide synthetase polypeptide; and (ii) an acyl moiety substrate of the peptide synthetase polypeptide, the contacting being performed under conditions and for a time sufficient that the engineered peptide synthetase polypeptide covalently links the acyl moiety from the acyl moiety substrate to the amino acid so that an acyl amino acid is generated.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts β-hydroxy myristoyl glutamate.

FIG. 2 depicts β-hydroxy myristoyl diaminopropionic acid, as further described in Example 6.

FIG. 3 depicts a betaine derived from β-hydroxy myristoyl diaminopropionic acid.

FIG. 4 depicts cocoyl glycinate.

FIG. 5 depicts LCMS analysis of FA-Gly. The 300 Dalton species is FA-Gly with a 14 carbon fatty acid tail. The 600 Dalton species is a dimer of the 300 Dalton species. The 314 Dalton species is FA-Gly with a 15 carbon fatty acid tail. The 628 Dalton species is a dimer of the 314 Dalton species.

FIG. 6 depicts MS/MS analysis of the 314 Dalton and 328 Dalton species: The 314 species fragments into one species that has Gly+CH3CO and a second species that is the expected size of the remainder of the fatty acid (labeled “-Gly”). The 328 species fragments into one species that has Gly+CH3CO and a second species that is the expected size of the remainder of the fatty acid (labeled “-Gly”).

DESCRIPTION OF CERTAIN EMBODIMENTS

Definitions

Acyl amino acid: The term “acyl amino acid” as used herein refers to an amino acid that is covalently linked to a fatty acid moiety. In some embodiments, the amino acid and fatty acid are covalently linked via an amide bond formed between a carboxylic acid group of a fatty acid and an amino group of an amino acid. In some embodiments, a fatty acid moiety or entity utilized or included in an acyl amino acid includes a β-hydroxyl group; in some embodiments, a fatty acid moiety or entity utilized or included in an acyl amino acid does not include a β-hydroxyl group. In some embodiments, a fatty acid moiety utilized or included in an acyl amino acid includes a β-amino group; in some embodiments, a fatty acid moiety or entity utilized or included in an acyl amino acid does not include a β-amino group. In some embodiments, a fatty acid moiety utilized or included in an acyl amino acid is unmodified at the β-position.

Amino acid: As used herein, the term “amino acid,” in its broadest sense, refers to any compound and/or substance that can be utilized in peptide synthesis (e.g., ribosomal or non-ribosomal synthesis). In some embodiments, an amino acid is any compound and/or substance that can be incorporated into a polypeptide chain, e.g., through formation of one or more peptide bonds. In some embodiments, an amino acid is any compound and/or substance that is a substrate for a peptide synthetase; in some such embodiments, an amino acid is any compound and/or substance onto which a peptide synthetase can link an acyl entity, for example through formation of an amide bond. In some embodiments, an amino acid has the general structure H₂N—C(H)(R)—COOH. In some embodiments, an amino acid is a naturally-occurring amino acid. In some embodiments, an amino acid is a synthetic amino acid; in some embodiments, an amino acid is a D-amino acid; in some embodiments, an amino acid is an L-amino acid. “Standard amino acid” refers to any of the twenty standard L-amino acids commonly found in naturally occurring peptides. “Nonstandard amino acid” refers to any amino acid, other than the standard amino acids, regardless of whether it is prepared synthetically or obtained from a natural source. In some embodiments, an amino acid, including a carboxy- and/or amino-terminal amino acid in a polypeptide, can contain a structural modification as compared with the general structure above. For example, in some embodiments, an amino acid may be modified by methylation, amidation, acetylation, and/or substitution as compared with the general structure. In some embodiments, such modification may, for example, alter the circulating half-life of a polypeptide containing the modified amino acid as compared with one containing an otherwise identical unmodified amino acid. In some embodiments, such modification does not significantly alter a relevant activity of a polypeptide containing the modified amino acid, as compared with one containing an otherwise identical unmodified amino acid. As will be clear from context, in some embodiments, the term “amino acid” is used to refer to a free amino acid; in some embodiments it is used to refer to an amino acid residue of a polypeptide. In some embodiments, a “naturally occurring” amino acid is one of the standard group of twenty amino acids that are the building blocks of polypeptides of most organisms, including alanine, arginine, asparagine, aspartic acid, cysteine, glutamic acid, glutamine, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine. In certain embodiments a “naturally occurring” amino acid may be one of those amino acids that are used less frequently and are typically not included in this standard group of twenty but are nevertheless still used by one or more organisms and incorporated into certain polypeptides. For example, the codons UAG and UGA normally function as stop codons in most organisms. However, in some organisms the codons UGA and UAG encode the amino acids selenocysteine and pyrrolysine, respectively. Thus, in certain embodiments, selenocysteine and pyrrolysine are naturally occurring amino acids.

Associated with: Two events or entities are “associated” with one another, as that term is used herein, if the presence, level and/or form of one is correlated with that of the other. For example, a particular entity (e.g., polypeptide) is considered to be associated with a particular disease, disorder, or condition, if its presence, level and/or form correlates with incidence of and/or susceptibility to the disease, disorder, or condition (e.g., across a relevant population). In some embodiments, two or more entities are physically “associated” with one another if they interact, directly or indirectly, so that they are and remain in physical proximity with one another. In some embodiments, two or more entities that are physically associated with one another are covalently linked to one another; in some embodiments, two or more entities that are physically associated with one another are not covalently linked to one another but are non-covalently associated, for example by means of hydrogen bonds, van der Waals interaction, hydrophobic interactions, magnetism, and combinations thereof.

Beta-hydroxy fatty acid linkage domain: The term “beta-hydroxy fatty acid linkage domain” as used herein refers to a polypeptide domain that covalently links a beta-hydroxy fatty acid to an amino acid to form an acyl amino acid. A variety of beta-hydroxy fatty acid linkage domains are known to those skilled in the art. However, different beta-hydroxy fatty acid linkage domains often exhibit specificity for one or more beta-hydroxy fatty acids. As one non-limiting example, the beta-hydroxy fatty acid linkage domain from surfactin synthetase is specific for the beta-hydroxy myristic acid, which contains 13 to 15 carbons in the fatty acid chain. Thus, the beta-hydroxy fatty acid linkage domain from surfactin synthetase can be used in accordance with the present invention to construct an engineered polypeptide useful in the generation of an acyl amino acid that comprises the fatty acid beta-hydroxy myristic acid.

Beta-hydroxy fatty acid: The term “beta-hydroxy fatty acid” as used herein refers to a fatty acid chain comprising a hydroxy group at the beta position of the fatty acid chain. As is understood by those skilled in the art, the beta position corresponds to the third carbon of the fatty acid chain, the first carbon being the carbon of the carboxylate group. Thus, when used in reference to an acyl amino acid of the present invention, where the carboxylate moiety of the fatty acid has been covalently attached to the nitrogen of the amino acid, the beta position corresponds to the carbon two carbons removed from the carbonyl carbon of the resulting amide group. A beta-hydroxy fatty acid to be used in accordance with the present invention may contain any number of carbon atoms in the fatty acid chain. As non-limiting examples, a beta-hydroxy fatty acid may contain 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more carbon atoms. Beta-hydroxy fatty acids to be used in accordance with the present invention may contain linear carbon chains, in which each carbon of the chain, with the exception of the terminal carbon atom and the carbon attached to the nitrogen of the amino acid, is directly covalently linked to two other carbon atoms. Additionally or alternatively, beta-hydroxy fatty acids to be used in accordance with the present invention may contain branched carbon chains, in which at least one carbon of the chain is directly covalently linked to three or more other carbon atoms. Beta-hydroxy fatty acids to be used in accordance with the present invention may contain one or more double bonds between adjacent carbon atoms. Alternatively, beta-hydroxy fatty acids to be used in accordance with the present invention may contain only single bonds between adjacent carbon atoms. A non-limiting exemplary beta-hydroxy fatty acid that may be used in accordance with the present invention is or comprises a beta-hydroxy acid which contains 13 to 15 carbons in the fatty acid chain; in some embodiments, an exemplary beta-hydroxy fatty acid that may be used in accordance with the present invention is or comprises beta-hydroxy myristic acid (the term “myristic” is usually used to mean 14 carbons). Those of ordinary skill in the art will be aware of various beta-hydroxy fatty acids that can be used in accordance with the present invention. Different beta-hydroxy fatty acid linkage domains that exhibit specificity for other beta-hydroxy fatty acids (e.g., naturally or non-naturally occurring beta-hydroxy fatty acids) may be used in accordance with the present invention to generate any acyl amino acid of the practitioner's choosing.

Characteristic sequence element: As used herein, the phrase “characteristic sequence element” refers to a sequence element found in a polymer (e.g., in a polypeptide or nucleic acid) that represents a characteristic portion of that polymer. In some embodiments, presence of a characteristic sequence element correlates with presence or level of a particular activity or property of the polymer. In some embodiments, presence (or absence) of a characteristic sequence element defines a particular polymer as a member (or not a member) of a particular family or group of such polymers. A characteristic sequence element typically comprises at least two monomers (e.g., amino acids or nucleotides). In some embodiments, a characteristic sequence element includes at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, or more monomers (e.g., contiguously linked monomers). In some embodiments, a characteristic sequence element includes at least first and second stretches of contiguous monomers spaced apart by one or more spacer regions whose length may or may not vary across polymers that share the sequence element.
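
By way of illustration only, a sequence element made up of two contiguous stretches separated by a variable-length spacer can be searched for with a regular expression. The sketch below is not drawn from any sequence in this specification: the stretches `GDS` and `KVR`, the spacer bounds of 2 to 6 residues, and the function names are all hypothetical assumptions.

```python
import re

# Hypothetical element: two fixed stretches separated by a spacer of 2 to 6
# residues. The stretches and spacer bounds are illustrative assumptions only.
FIRST_STRETCH = "GDS"
SECOND_STRETCH = "KVR"
ELEMENT = re.compile(f"{FIRST_STRETCH}[A-Z]{{2,6}}{SECOND_STRETCH}")


def has_element(sequence: str) -> bool:
    """Return True if the sequence contains the hypothetical element."""
    return ELEMENT.search(sequence) is not None


def spacer_length(sequence: str):
    """Return the spacer length of the first match, or None if absent."""
    match = ELEMENT.search(sequence)
    if match is None:
        return None
    return len(match.group(0)) - len(FIRST_STRETCH) - len(SECOND_STRETCH)
```

Under these assumptions, two polypeptides whose spacers differ in length (within the allowed bounds) would both be reported as sharing the element.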

Combination therapy. As used herein, the term “combination therapy” refers to those situations in which a subject is simultaneously exposed to two or more therapeutic agents. In some embodiments, such agents are administered simultaneously; in some embodiments, such agents are administered sequentially; in some embodiments, such agents are administered in overlapping regimens.

Comparable. The term “comparable”, as used herein, refers to two or more agents, entities, situations, sets of conditions, etc. that may not be identical to one another but that are sufficiently similar to permit comparison therebetween so that conclusions may reasonably be drawn based on differences or similarities observed. Those of ordinary skill in the art will understand, in context, what degree of identity is required in any given circumstance for two or more such agents, entities, situations, sets of conditions, etc. to be considered comparable.

Corresponding to: As used herein, the term “corresponding to” is often used to designate the position/identity of a residue in a polymer, such as an amino acid residue in a polypeptide or a nucleotide residue in a nucleic acid. Those of ordinary skill will appreciate that, for purposes of simplicity, residues in such a polymer are often designated using a canonical numbering system based on a reference related polymer, so that a residue in a first polymer “corresponding to” a residue at position 190 in the reference polymer, for example, need not actually be the 190th residue in the first polymer but rather corresponds to the residue found at the 190th position in the reference polymer; those of ordinary skill in the art readily appreciate how to identify “corresponding” amino acids, including through use of one or more commercially-available algorithms specifically designed for polymer sequence comparisons.
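
As a minimal sketch of how a “corresponding” residue might be located, the following assumes that the reference polymer and the first polymer have already been aligned by some external algorithm and are supplied as equal-length strings in which `-` marks a gap. The function name and the gap convention are assumptions made for illustration and do not describe any particular commercial tool.

```python
def corresponding_position(aligned_ref, aligned_query, ref_position, gap="-"):
    """Map a 1-based position in the reference to a 1-based position in the query.

    Both inputs are rows of one pairwise alignment (equal length, gaps as `gap`).
    Returns None when the reference residue is aligned to a gap in the query.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    query_count = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != gap:
            ref_count += 1
        if query_char != gap:
            query_count += 1
        # Stop at the alignment column that holds the requested reference residue.
        if ref_char != gap and ref_count == ref_position:
            return query_count if query_char != gap else None
    raise ValueError("reference position is beyond the end of the reference")
```

For example, with the aligned rows `MKT-AYI` (reference) and `--TGAYI` (first polymer), the residue at reference position 4 corresponds to the third residue of the first polymer, which shows why the two position numbers need not agree.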

Domain, Polypeptide domain: The terms “domain” and “polypeptide domain” as used herein generally refer to polypeptide moieties that display a particular activity, even when isolated (e.g., cleaved) from other polypeptides or polypeptide domains. In some embodiments, a polypeptide domain folds into a particular discrete structure in three-dimensional space. In some embodiments, a polypeptide domain within a longer polypeptide is separated from one or more other polypeptide domains within the longer polypeptide by virtue of a linker element, for example, that may comprise a substantially unstructured stretch of amino acids. In some embodiments, the terms refer to domains that naturally occur in longer polypeptides; in some embodiments, the term refers to engineered polypeptide moieties that correspond and/or show significant homology and/or identity to such naturally occurring polypeptide moieties, or to other reference polypeptide moieties (e.g., historical engineered moieties). In some embodiments, an engineered domain that corresponds and/or shows significant homology and/or identity to a naturally occurring or other reference moiety shares a characteristic structure (e.g., primary structure such as the amino acid sequence of the domain, and/or secondary, tertiary, quaternary, etc. structures); alternatively or additionally, such an engineered domain may exhibit one or more distinct functions that it shares with its reference polypeptide moieties. As will be understood by those skilled in the art, in many cases polypeptides are modular and are comprised of one or more polypeptide domains; in some such embodiments, each domain exhibits one or more distinct functions that contribute to the overall function of the polypeptide. In some embodiments, the structure and/or function of many such domains are known to those skilled in the art.

Engineered: The term “engineered” as used herein refers to a non-naturally occurring moiety that has been created by the hand of man. For example, in reference to a polypeptide, an “engineered polypeptide” refers to a polypeptide that has been designed and/or produced by the hand of man. In some embodiments, an engineered polypeptide has an amino acid sequence that includes one or more sequence elements that do(es) not occur in nature. In some embodiments, an engineered polypeptide has an amino acid sequence that includes one or more sequence elements that does occur in nature, but that is present in the engineered polypeptide in a different sequence context (e.g., separated from at least one sequence to which it is linked in nature and/or linked with at least one sequence element to which it is not linked in nature) from that in which it occurs in nature. In some embodiments, an engineered polypeptide is one in which naturally-occurring sequence element(s) is/are separated from at least one sequence with which they/it is associated (e.g., linked) in nature and/or is otherwise manipulated to comprise a polypeptide that does not exist in nature. In various embodiments, an engineered polypeptide comprises two or more covalently linked polypeptide domains. Typically such domains will be linked via peptide bonds, although the present invention is not limited to engineered polypeptides comprising polypeptide domains linked via peptide bonds, and encompasses other covalent linkages known to those skilled in the art. One or more covalently linked polypeptide domains of engineered polypeptides may be naturally occurring. Thus, in certain embodiments, engineered polypeptides of the present invention comprise two or more covalently linked domains, at least one of which is naturally occurring. In certain embodiments, two or more naturally occurring polypeptide domains are covalently linked to generate an engineered polypeptide. For example, naturally occurring polypeptide domains from two or more different polypeptides may be covalently linked to generate an engineered polypeptide. In certain embodiments, naturally occurring polypeptide domains of an engineered polypeptide are covalently linked in nature, but are covalently linked in the engineered polypeptide in a way that is different from the way the domains are linked in nature. For example, two polypeptide domains that naturally occur in the same polypeptide but which are separated by one or more intervening amino acid residues may be directly covalently linked (e.g., by removing the intervening amino acid residues) to generate an engineered polypeptide of the present invention. Additionally or alternatively, two polypeptide domains that naturally occur in the same polypeptide which are directly covalently linked together (e.g., not separated by one or more intervening amino acid residues) may be indirectly covalently linked (e.g., by inserting one or more intervening amino acid residues) to generate an engineered polypeptide of the present invention. In certain embodiments, one or more covalently linked polypeptide domains of an engineered polypeptide may not exist naturally. For example, such polypeptide domains may be engineered themselves.

Fatty acid linkage domain: The term “fatty acid linkage domain” as used herein refers to a polypeptide domain that covalently links a fatty acid to an amino acid to form an acyl amino acid. In some embodiments, a fatty acid linkage domain is a condensation domain; in some embodiments such a fatty acid linkage domain is part of a single polypeptide or a polypeptide complex with at least or only an adenylation domain, a thiolation domain, or both. A variety of fatty acid linkage domains are known in the art, such as, for example, fatty acid linkage domains present in various peptide synthetase complexes that produce lipopeptides. In certain embodiments, a fatty acid linkage domain links a beta-hydroxy fatty acid to an amino acid; in some embodiments, a fatty acid linkage domain links a beta-amino fatty acid to an amino acid; in some embodiments, a fatty acid linkage domain links a fatty acid that is unmodified at the beta position to an amino acid. In some embodiments, a fatty acid linkage domain catalyzes condensation of a fatty acid and an amino acid so that an amide bond is formed, for example between a carboxylic acid moiety on a fatty acid and an amino moiety on an amino acid.

Homology: As used herein, the term “homology” refers to the overall relatedness between polymeric molecules, e.g., between nucleic acid molecules (e.g., DNA molecules and/or RNA molecules) and/or between polypeptide molecules. In some embodiments, polymeric molecules are considered to be “homologous” to one another if their sequences are at least 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% identical. In some embodiments, polymeric molecules are considered to be “homologous” to one another if their sequences are at least 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% similar (e.g., containing residues with related chemical properties at corresponding positions). For example, as is well known by those of ordinary skill in the art, certain amino acids are typically classified as similar to one another as “hydrophobic” or “hydrophilic” amino acids, and/or as having “polar” or “non-polar” side chains. Substitution of one amino acid for another of the same type may often be considered a “homologous” substitution. Typical amino acid categorizations are summarized below:

Amino Acid     3-Letter  1-Letter  Polarity  Charge    Hydropathy Index
Alanine        Ala       A         nonpolar  neutral   1.8
Arginine       Arg       R         polar     positive  −4.5
Asparagine     Asn       N         polar     neutral   −3.5
Aspartic acid  Asp       D         polar     negative  −3.5
Cysteine       Cys       C         nonpolar  neutral   2.5
Glutamic acid  Glu       E         polar     negative  −3.5
Glutamine      Gln       Q         polar     neutral   −3.5
Glycine        Gly       G         nonpolar  neutral   −0.4
Histidine      His       H         polar     positive  −3.2
Isoleucine     Ile       I         nonpolar  neutral   4.5
Leucine        Leu       L         nonpolar  neutral   3.8
Lysine         Lys       K         polar     positive  −3.9
Methionine     Met       M         nonpolar  neutral   1.9
Phenylalanine  Phe       F         nonpolar  neutral   2.8
Proline        Pro       P         nonpolar  neutral   −1.6
Serine         Ser       S         polar     neutral   −0.8
Threonine      Thr       T         polar     neutral   −0.7
Tryptophan     Trp       W         nonpolar  neutral   −0.9
Tyrosine       Tyr       Y         polar     neutral   −1.3
Valine         Val       V         nonpolar  neutral   4.2

Ambiguous Amino Acids              3-Letter  1-Letter
Asparagine or aspartic acid        Asx       B
Glutamine or glutamic acid         Glx       Z
Leucine or Isoleucine              Xle       J
Unspecified or unknown amino acid  Xaa       X

As will be understood by those skilled in the art, a variety of algorithms are available that permit comparison of sequences in order to determine their degree of homology, including by permitting gaps of designated length in one sequence relative to another when considering which residues “correspond” to one another in different sequences. Calculation of the percent homology between two nucleic acid sequences, for example, can be performed by aligning the two sequences for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second nucleic acid sequence for optimal alignment and non-corresponding sequences can be disregarded for comparison purposes). In certain embodiments, the length of a sequence aligned for comparison purposes is at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or substantially 100% of the length of the reference sequence. The nucleotides at corresponding nucleotide positions are then compared. When a position in the first sequence is occupied by the same nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position; when a position in the first sequence is occupied by a similar nucleotide as the corresponding position in the second sequence, then the molecules are similar at that position. The percent homology between the two sequences is a function of the number of identical and similar positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which needs to be introduced for optimal alignment of the two sequences. Representative algorithms and computer programs useful in determining the percent homology between two nucleotide sequences include, for example, the algorithm of Meyers and Miller (CABIOS, 1989, 4: 11-17), which has been incorporated into the ALIGN program (version 2.0) using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4. The percent homology between two nucleotide sequences can, alternatively, be determined for example using the GAP program in the GCG software package using an NWSgapdna.CMP matrix.
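
For illustration, the counting of identical and similar positions can be sketched for a pair of already-aligned polypeptide sequences. The sketch below is not the ALIGN or GAP program. It assumes that two residues are “similar” when they share both the polarity and the charge class given in the categorization table above, and that gap columns count in the denominator as non-matching positions; both choices, and the function name, are assumptions made for this example.

```python
# Polarity and charge classes copied from the categorization table above.
SIDE_CHAIN_CLASS = {
    "A": ("nonpolar", "neutral"), "R": ("polar", "positive"),
    "N": ("polar", "neutral"), "D": ("polar", "negative"),
    "C": ("nonpolar", "neutral"), "E": ("polar", "negative"),
    "Q": ("polar", "neutral"), "G": ("nonpolar", "neutral"),
    "H": ("polar", "positive"), "I": ("nonpolar", "neutral"),
    "L": ("nonpolar", "neutral"), "K": ("polar", "positive"),
    "M": ("nonpolar", "neutral"), "F": ("nonpolar", "neutral"),
    "P": ("nonpolar", "neutral"), "S": ("polar", "neutral"),
    "T": ("polar", "neutral"), "W": ("nonpolar", "neutral"),
    "Y": ("polar", "neutral"), "V": ("nonpolar", "neutral"),
}


def percent_homology(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns that are identical or similar.

    Assumption: gap columns count toward the total and never match.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal length")
    matching = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # a gap is neither identical nor similar
        if a == b or SIDE_CHAIN_CLASS[a] == SIDE_CHAIN_CLASS[b]:
            matching += 1
    return 100.0 * matching / len(aligned_a)
```

Other treatments of gaps (for example, weighting by gap length) would give different percentages for the same alignment, so any reported value should state the convention used.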

Identity: As used herein, the term “identity” refers to the overall relatedness between polymeric molecules, e.g., between nucleic acid molecules (e.g., DNA molecules and/or RNA molecules) and/or between polypeptide molecules. In some embodiments, polymeric molecules are considered to be “substantially identical” to one another if their sequences are at least 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% identical. As will be understood by those skilled in the art, a variety of algorithms are available that permit comparison of sequences in order to determine their degree of homology, including by permitting gaps of designated length in one sequence relative to another when considering which residues “correspond” to one another in different sequences. Calculation of the percent identity between two nucleic acid sequences, for example, can be performed by aligning the two sequences for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second nucleic acid sequence for optimal alignment and non-corresponding sequences can be disregarded for comparison purposes). In certain embodiments, the length of a sequence aligned for comparison purposes is at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or substantially 100% of the length of the reference sequence. The nucleotides at corresponding nucleotide positions are then compared. When a position in the first sequence is occupied by the same nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which needs to be introduced for optimal alignment of the two sequences. Representative algorithms and computer programs useful in determining the percent identity between two nucleotide sequences include, for example, the algorithm of Meyers and Miller (CABIOS, 1989, 4: 11-17), which has been incorporated into the ALIGN program (version 2.0) using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4. The percent identity between two nucleotide sequences can, alternatively, be determined for example using the GAP program in the GCG software package using an NWSgapdna.CMP matrix.
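
A minimal sketch of the identity calculation for two already-aligned nucleotide sequences follows. It is not the ALIGN or GAP program: the function name, the default 30% aligned-length threshold (one of the values listed above), and the choice to count every gap column in the denominator are assumptions made for illustration.

```python
def percent_identity(aligned_ref, aligned_query, min_aligned_fraction=0.30, gap="-"):
    """Percent of alignment columns holding the same nucleotide in both rows.

    Raises ValueError when fewer than `min_aligned_fraction` of the reference
    residues are aligned against a residue (rather than a gap) in the query.
    """
    if len(aligned_ref) != len(aligned_query) or not aligned_ref:
        raise ValueError("aligned sequences must be non-empty and equal length")
    ref_length = sum(1 for r in aligned_ref if r != gap)
    aligned_columns = 0
    identical = 0
    for r, q in zip(aligned_ref, aligned_query):
        if r == gap or q == gap:
            continue  # gap columns lower the score through the denominator
        aligned_columns += 1
        if r == q:
            identical += 1
    if ref_length == 0 or aligned_columns / ref_length < min_aligned_fraction:
        raise ValueError("aligned region is too short relative to the reference")
    return 100.0 * identical / len(aligned_ref)
```

With the aligned rows `ACGT-ACGT` and `ACGTTAC-A`, six of nine columns are identical, so the sketch reports roughly 66.7% identity; a stricter or looser gap convention would change that figure.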

Isolated: As used herein, the term “isolated” refers to a substance and/or entity that has been (1) separated from at least some of the components with which it was associated when initially produced (whether in nature and/or in an experimental setting), and/or (2) designed, produced, prepared, and/or manufactured by the hand of man. Isolated substances and/or entities may be separated from about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more than about 99% of the other components with which they were initially associated. In some embodiments, isolated agents are about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more than about 99% pure. As used herein, a substance is “pure” if it is substantially free of other components. In some embodiments, as will be understood by those skilled in the art, a substance may still be considered “isolated” or even “pure”, after having been combined with certain other components such as, for example, one or more carriers or excipients (e.g., buffer, solvent, water, etc.); in such embodiments, percent isolation or purity of the substance is calculated without including such carriers or excipients. In some embodiments, isolation involves or requires disruption of covalent bonds (e.g., to isolate a polypeptide domain from a longer polypeptide and/or to isolate a nucleotide sequence element from a longer oligonucleotide or nucleic acid).
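
The exclusion of carriers or excipients from a purity calculation can be illustrated with a short arithmetic sketch. The component names and masses below are invented for the example and are not measurements from this specification; the function name is likewise hypothetical.

```python
def percent_purity(component_masses, target, excluded=()):
    """Purity of `target` by mass, ignoring any carriers or excipients.

    component_masses: mapping of component name to mass (any consistent unit).
    excluded: names of carriers or excipients left out of the calculation.
    """
    considered = {
        name: mass for name, mass in component_masses.items() if name not in excluded
    }
    total = sum(considered.values())
    if total <= 0:
        raise ValueError("no mass remains after excluding carriers/excipients")
    return 100.0 * considered.get(target, 0.0) / total


# Hypothetical preparation: 9.5 g product and 0.5 g residual fatty acid in 90 g water.
example = {"acyl glycinate": 9.5, "residual fatty acid": 0.5, "water": 90.0}
purity_excluding_water = percent_purity(example, "acyl glycinate", excluded={"water"})
purity_including_water = percent_purity(example, "acyl glycinate")
```

In this invented example the product is about 95% pure once water is excluded, but only about 9.5% of the total mass when water is counted, which is why the definition specifies that carriers and excipients are left out.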

Naturally occurring: The term “naturally occurring”, as used herein, refers to an agent or entity that is known to exist in nature.

Nucleic acid: As used herein, the term “nucleic acid,” in its broadest sense, refers to any compound and/or substance that is or can be incorporated into an oligonucleotide chain. In some embodiments, a nucleic acid is a compound and/or substance that is or can be incorporated into an oligonucleotide chain via a phosphodiester linkage. As will be clear from context, in some embodiments, “nucleic acid” refers to individual nucleic acid residues (e.g., nucleotides and/or nucleosides); in some embodiments, “nucleic acid” refers to an oligonucleotide chain comprising individual nucleic acid residues. In some embodiments, a “nucleic acid” is or comprises RNA; in some embodiments, a “nucleic acid” is or comprises DNA. In some embodiments, a nucleic acid is, comprises, or consists of one or more natural nucleic acid residues. In some embodiments, a nucleic acid is, comprises, or consists of one or more nucleic acid analogs. In some embodiments, a nucleic acid analog differs from a nucleic acid in that it does not utilize a phosphodiester backbone. For example, in some embodiments, a nucleic acid is, comprises, or consists of one or more “peptide nucleic acids”, which are known in the art, have peptide bonds instead of phosphodiester bonds in the backbone, and are considered within the scope of the present invention. Alternatively or additionally, in some embodiments, a nucleic acid has one or more phosphorothioate and/or 5′-N-phosphoramidite linkages rather than phosphodiester bonds. In some embodiments, a nucleic acid is, comprises, or consists of one or more natural nucleosides (e.g., adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine).
In some embodiments, a nucleic acid is, comprises, or consists of one or more nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C-5 propynyl-cytidine, C-5 propynyl-uridine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, 2-thiocytidine, methylated bases, intercalated bases, and combinations thereof). In some embodiments, a nucleic acid comprises one or more modified sugars (e.g., 2′-fluororibose, ribose, 2′-deoxyribose, arabinose, and hexose) as compared with those in natural nucleic acids. In some embodiments, a nucleic acid has a nucleotide sequence that encodes a functional gene product such as an RNA or protein. In some embodiments, a nucleic acid includes one or more introns. In some embodiments, nucleic acids are prepared by one or more of isolation from a natural source, enzymatic synthesis by polymerization based on a complementary template (in vivo or in vitro), reproduction in a recombinant cell or system, and chemical synthesis. In some embodiments, a nucleic acid is at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 600, 700, 800, 900, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000 or more residues long.

Peptide synthetase complex: The term “peptide synthetase complex” as used herein refers to an enzyme that catalyzes the non-ribosomal production of peptides. As will be appreciated by those of ordinary skill in the art, peptide synthetase complexes are modular, and are comprised of individual peptide synthetase modules that perform different steps in the synthesis of the ultimate peptide; typically, each module performs one step (e.g., adds a single amino acid). A peptide synthetase complex may comprise a single enzymatic subunit (e.g., a single polypeptide), or may comprise two or more enzymatic subunits (e.g., two or more polypeptides). A peptide synthetase complex typically comprises at least one peptide synthetase domain, and may further comprise one or more additional domains such as, for example, a fatty acid linkage domain, a thioesterase domain, a reductase domain, etc. Peptide synthetase domains of a peptide synthetase complex may be distributed across two or more enzymatic subunits, with two or more peptide synthetase domains present in a given enzymatic subunit. For example, the surfactin peptide synthetase complex (also referred to herein simply as “surfactin synthetase complex”) comprises three distinct polypeptide enzymatic subunits: the first two subunits each comprise three peptide synthetase domains, while the third subunit comprises a single peptide synthetase domain.

Peptide synthetase domain: The term “peptide synthetase domain” as used herein refers to a polypeptide domain that minimally comprises three domains: an adenylation (A) domain, responsible for selectively recognizing and activating a specific amino acid, a thiolation (T) domain, which tethers the activated amino acid to a cofactor via thioester linkage, and a condensation (C) domain, which links amino acids joined to successive units of the peptide synthetase by the formation of amide bonds. A peptide synthetase domain typically recognizes and activates a single, specific amino acid, and in the situation where the peptide synthetase domain is not the first domain in the pathway, links the specific amino acid to the growing peptide chain.

Polypeptide: The term “polypeptide” as used herein refers to a series of amino acids joined together in peptide linkages. In some embodiments, a “polypeptide” has a structure as achieved through synthesis by ribosomal machinery in naturally occurring organisms. In some embodiments, a “polypeptide” has a structure as achieved through chemical synthesis (e.g., in vitro). In some embodiments, a “polypeptide” has a structure as achieved through joining of a series of amino acids by non-ribosomal machinery, such as, by way of non-limiting example, polypeptides synthesized by peptide synthetases. Such non-ribosomally produced polypeptides exhibit a greater diversity in covalent linkages than polypeptides synthesized by ribosomes (although those skilled in the art will understand that the amino acids of ribosomally-produced polypeptides may also be linked by covalent bonds that are not peptide bonds, such as the linkage of cysteines via disulfide bonds). In some embodiments, the term is used to refer to specific functional classes of polypeptides, such as, for example, autoantigen polypeptides, nicotinic acetylcholine receptor polypeptides, alloantigen polypeptides, etc. For each such class, the present specification provides several examples of amino acid sequences of known exemplary polypeptides within the class; in some embodiments, such known polypeptides are reference polypeptides for the class. In such embodiments, the term “polypeptide” refers to any member of the class that shows significant sequence homology or identity with a relevant reference polypeptide. In many embodiments, such member also shares significant activity with the reference polypeptide.
For example, in some embodiments, a member polypeptide shows an overall degree of sequence homology or identity with a reference polypeptide that is at least about 30-40%, and is often greater than about 50%, 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more and/or includes at least one region (i.e., a conserved region, often including a characteristic sequence element) that shows very high sequence identity, often greater than 90% or even 95%, 96%, 97%, 98%, or 99%. Such a conserved region usually encompasses at least 3-4 and often up to 20 or more amino acids; in some embodiments, a conserved region encompasses at least one stretch of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or more contiguous amino acids. Polypeptides can be two or more amino acids in length, although most polypeptides produced by ribosomes and peptide synthetases are longer than two amino acids. For example, in some embodiments, polypeptides may be 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10,000 or more amino acids in length.

Reductase domain: The term “reductase domain” as used herein refers to a polypeptide domain that catalyzes release of an acyl amino acid produced by a peptide synthetase complex from the peptide synthetase complex. In certain embodiments, a reductase domain is covalently linked to a peptide synthetase domain and a fatty acid linkage domain such as a beta-hydroxy fatty acid linkage domain to generate an engineered polypeptide useful in the synthesis of an acyl amino acid. A variety of reductase domains are found in nature in nonribosomal peptide synthetase complexes from a variety of species. A non-limiting example of a reductase domain that may be used in accordance with the present invention includes the reductase domain from linear gramicidin (ATCC8185). However, any reductase domain that releases an acyl amino acid produced by a peptide synthetase complex from the peptide synthetase complex may be used in accordance with the present invention. In some embodiments, reductase domains are characterized by the presence of the consensus sequence: [LIVSPADNK]-x(9)-{P}-x(2)-Y-[PSTAGNCV]-[STAGNQCIVM]-[STAGC]-K-{PC}-[SAGFYR]-[LIVMSTAGD]-x-{K}-[LIVMFYW]-{D}-x-{YR}-[LIVMFYWGAPTHQ]-[GSACQRHM] (SEQ ID NO: 1), where square brackets (“[ ]”) indicate amino acids that are typically present at that position, squiggly brackets (“{ }”) indicate amino acids that are typically not present at that position, and “x” denotes any amino acid or a gap. For example, “x(9)” denotes any amino acids or gaps for nine consecutive positions. Those skilled in the art will be aware of methods to determine whether a given polypeptide domain is a reductase domain.
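As one non-limiting illustration of how such a consensus sequence might be applied, the bracket notation above can be translated mechanically into a regular expression and used to scan a candidate sequence. The following is a minimal sketch only, under stated assumptions: each “x” is treated as exactly one residue (a gap cannot be represented without an alignment, so the “or a gap” alternative is not modeled), and the helper consensus_to_regex and the toy sequence are hypothetical and are not taken from the Sequence Listing. A match indicates only that the motif is present, not that the domain is functional.

```python
import re

# Consensus string copied from the reductase domain definition (SEQ ID NO: 1).
REDUCTASE_CONSENSUS = (
    "[LIVSPADNK]-x(9)-{P}-x(2)-Y-[PSTAGNCV]-[STAGNQCIVM]-[STAGC]-K-{PC}-"
    "[SAGFYR]-[LIVMSTAGD]-x-{K}-[LIVMFYW]-{D}-x-{YR}-[LIVMFYWGAPTHQ]-[GSACQRHM]"
)


def consensus_to_regex(pattern: str) -> str:
    """Translate the bracket notation into a regular expression string.

    Assumption: "x" means exactly one residue; gaps are not modeled.
    """
    parts = []
    for element in pattern.split("-"):
        # Split an element such as "x(9)" into its core and repeat count.
        m = re.fullmatch(r"(.+?)(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"empty pattern element in {pattern!r}")
        core, count = m.group(1), m.group(2)
        if core == "x":
            token = "."                        # any residue
        elif core.startswith("[") and core.endswith("]"):
            token = core                       # residues typically present
        elif core.startswith("{") and core.endswith("}"):
            token = "[^" + core[1:-1] + "]"    # residues typically absent
        elif len(core) == 1 and core.isalpha():
            token = core                       # one fixed residue
        else:
            raise ValueError(f"unrecognized pattern element: {element!r}")
        if count is not None:
            token += "{" + count + "}"
        parts.append(token)
    return "".join(parts)


REDUCTASE_REGEX = re.compile(consensus_to_regex(REDUCTASE_CONSENSUS))

# Hypothetical toy sequence built to satisfy the 29-position consensus.
MOTIF_HIT = "L" + "A" * 12 + "YSTAKASLAALAAALG"
TOY_SEQUENCE = "MG" + MOTIF_HIT + "EE"

hit = REDUCTASE_REGEX.search(TOY_SEQUENCE)
```

In this sketch, replacing the single tyrosine of the toy sequence with another residue removes the match, because the consensus requires Y at that position.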

Small molecule: As used herein, the term “small molecule” means a low molecular weight organic compound that may serve as an enzyme substrate or regulator of biological processes. In general, a “small molecule” is a molecule that is less than about 5 kilodaltons (kD) in size. In some embodiments, provided nanoparticles further include one or more small molecules. In some embodiments, the small molecule is less than about 4 kD, about 3 kD, about 2 kD, or about 1 kD. In some embodiments, the small molecule is less than about 800 daltons (D), about 600 D, about 500 D, about 400 D, about 300 D, about 200 D, or about 100 D. In some embodiments, a small molecule is less than about 2000 g/mol, less than about 1500 g/mol, less than about 1000 g/mol, less than about 800 g/mol, or less than about 500 g/mol. In some embodiments, one or more small molecules are encapsulated within the nanoparticle. In some embodiments, small molecules are non-polymeric. In some embodiments, in accordance with the present invention, small molecules are not proteins, polypeptides, oligopeptides, peptides, polynucleotides, oligonucleotides, polysaccharides, glycoproteins, proteoglycans, etc. In some embodiments, a small molecule is a therapeutic. In some embodiments, a small molecule is an adjuvant. In some embodiments, a small molecule is a drug.

Surfactin: Surfactin is a cyclic lipopeptide that is naturally produced by certain bacteria, including the Gram-positive endospore-forming bacterium Bacillus subtilis. Surfactin is an amphiphilic molecule (having both hydrophobic and hydrophilic properties) and is thus soluble in both organic solvents and water. Surfactin exhibits exceptional surfactant properties, making it a commercially valuable molecule. Due to its surfactant properties, surfactin also functions as an antibiotic. For example, surfactin is known to be effective as an anti-bacterial, anti-viral, anti-fungal, anti-mycoplasma and hemolytic compound. Surfactin is capable of penetrating the cell membranes of all types of bacteria, including both Gram-negative and Gram-positive bacteria, which differ in the composition of their membrane. Gram-positive bacteria have a thick peptidoglycan layer on the outside of their phospholipid bilayer. In contrast, Gram-negative bacteria have a thinner peptidoglycan layer on the outside of their phospholipid bilayer, and further contain an additional outer lipopolysaccharide membrane. Surfactin's surfactant activity permits it to create a permeable environment for the lipid bilayer and causes disruption that solubilizes the membrane of both types of bacteria. The minimum inhibitory concentration (MIC) at which surfactin exerts antibacterial effects is in the range of 12-50 μg/ml. In addition to its antibacterial properties, surfactin also exhibits antiviral properties, and it is known to disrupt enveloped viruses such as HIV and HSV. Surfactin not only disrupts the lipid envelope of viruses, but also their capsids through ion channel formations. Surfactin isoforms containing fatty acid chains with 14 or 15 carbon atoms exhibited improved viral inactivation, thought to be due to improved disruption of the viral envelope. Surfactin consists of a seven amino acid peptide loop, and a hydrophobic fatty acid chain (beta-hydroxy myristic acid) that is thirteen to fifteen carbons long.
The fatty acid chain permits surfactin to penetrate cellular membranes. The peptide loop comprises the amino acids L-glutamate, L-leucine, D-leucine, L-valine, L-aspartate, D-leucine and L-leucine. The glutamate and aspartate residues at positions 1 and 5, respectively, constitute a minor polar domain. On the opposite side, the valine residue at position 4 extends down facing the fatty acid chain, making up a major hydrophobic domain. Surfactin is synthesized by the surfactin synthetase complex, which comprises the three surfactin synthetase polypeptide subunits SrfA-A, SrfA-B, and SrfA-C. The surfactin synthetase polypeptide subunits SrfA-A and SrfA-B each comprise three peptide synthetase domains, each of which adds a single amino acid to the growing surfactin peptide, while the monomodular surfactin synthetase polypeptide subunit SrfA-C comprises a single peptide synthetase domain and adds the last amino acid residue to the heptapeptide. Additionally, the SrfA-C subunit comprises a thioesterase domain, which catalyzes the release of the product via a nucleophilic attack of the beta-hydroxy of the fatty acid on the carbonyl of the C-terminal Leu of the peptide, cyclizing the molecule via formation of an ester. The spectrum of the beta-hydroxy fatty acids was elucidated as iso, anteiso C13, iso, normal C14 and iso, anteiso C15, and a recent study has indicated that surfactin retains an R configuration at C-beta (Nagai et al., Study on surfactin, a cyclic depsipeptide. 2. Synthesis of surfactin B2 produced by Bacillus natto KMD 2311. Chem Pharm Bull (Tokyo) 44: 5-10, 1996).

Surfactin is a lipopeptide synthesized by the surfactin synthetase complex. Surfactin comprises seven amino acids, which are initially joined by peptide bonds, as well as a beta-hydroxy fatty acid covalently linked to the first amino acid, glutamate. However, upon addition of the final amino acid (leucine), the thioesterase domain of the SrfA-C protein catalyzes the release of the product via a nucleophilic attack of the beta-hydroxy of the fatty acid on the carbonyl of the C-terminal Leu of the peptide, cyclizing the molecule via formation of an ester, resulting in the C-terminus carboxyl group of leucine attached via a lactone bond to the beta-hydroxyl group of the fatty acid.

Thioesterase domain: The term “thioesterase domain” as used herein refers to a polypeptide domain that catalyzes release of an acyl amino acid produced by a peptide synthetase complex from the peptide synthetase complex. A variety of thioesterase domains are found in nature in nonribosomal peptide synthetase complexes from a variety of species. A non-limiting example of a thioesterase domain that may be used in accordance with the present invention includes the thioesterase domain from the Bacillus subtilis surfactin synthetase complex, present in the SrfA-C subunit. However, any thioesterase domain that releases an acyl amino acid produced by a peptide synthetase complex from the peptide synthetase complex may be used in accordance with the present invention. In some embodiments, thioesterase domains are characterized by the presence of the consensus sequence: [LIV]-{KG}-[LIVFY]-[LIVMST]-G-[HYWV]-S-{YAG}-G-[GSTAC] (SEQ ID NO: 2), where square brackets (“[ ]”) indicate amino acids that are typically present at that position, and squiggly brackets (“{ }”) indicate amino acids that are typically not present at that position. Those skilled in the art will be aware of methods to determine whether a given polypeptide domain is a thioesterase domain.
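By way of non-limiting example, the ten-position thioesterase consensus above can be written directly as a regular expression and used to report where the motif occurs in a candidate sequence. This is a sketch under stated assumptions: the regular expression is a hand translation of SEQ ID NO: 2 as written above, the function find_motif_hits and the toy sequence are hypothetical, positions are reported 1-based, and only non-overlapping occurrences are returned.

```python
import re

# Hand translation of [LIV]-{KG}-[LIVFY]-[LIVMST]-G-[HYWV]-S-{YAG}-G-[GSTAC]:
# square brackets are kept, squiggly brackets become negated classes.
THIOESTERASE_REGEX = re.compile(
    r"[LIV][^KG][LIVFY][LIVMST]G[HYWV]S[^YAG]G[GSTAC]"
)


def find_motif_hits(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched residues) for each hit."""
    return [
        (m.start() + 1, m.group())
        for m in THIOESTERASE_REGEX.finditer(sequence)
    ]


# Hypothetical toy sequence containing one occurrence of the motif.
TOY_SEQUENCE = "MQLALLGHSMGGTT"
hits = find_motif_hits(TOY_SEQUENCE)
```

A sequence with no hit under this sketch is not thereby excluded from being a thioesterase domain, since the consensus describes residues that are typically, rather than always, present.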

Engineered Polypeptides Useful in the Generation of Acyl Amino Acids

The present invention provides compositions and methods for the generation of acyl amino acids. In certain embodiments, compositions of the present invention comprise engineered polypeptides that are useful in the production of acyl amino acids. In certain embodiments, engineered polypeptides of the present invention comprise a peptide synthetase domain.

In one aspect, the present invention encompasses the recognition that a single peptide synthetase domain, not associated (e.g., not associated covalently and/or not otherwise associated) with, for example, another domain typically found in a peptide synthetase complex (e.g., a fatty acid linkage domain, a thioesterase domain, a reductase domain, etc. and/or a combination thereof), can be sufficient to produce an acyl amino acid as described herein.

In accordance with many embodiments of the present invention, peptide synthetase domains useful for the production of acyl amino acids as described herein correspond and/or show significant homology and/or identity to a first peptide synthetase domain found in a naturally-occurring peptide synthetase complex. That is, as is known in the art, some peptide synthetase domains (i.e., some polypeptides comprising adenylation (A), thiolation (T), and condensation (C) domains) catalyze condensation of a fatty acid with an amino acid, and some catalyze condensation of two amino acids with one another. In accordance with some embodiments of the present invention, peptide synthetase domains useful for the production of acyl amino acids as described herein are those that catalyze condensation of an amino acid with a fatty acid; such peptide synthetase domains are typically utilized herein in a form (e.g., as part of a polypeptide) that is separated from and/or does not include another peptide synthetase domain. Many naturally-occurring peptide synthetase domains are found in nature within peptide synthetase complexes that synthesize lipopeptides. Such peptide synthetase complexes are multienzymatic complexes found in both prokaryotes and eukaryotes, and comprise one or more enzymatic subunits that catalyze the non-ribosomal production of a variety of peptides (see, for example, Kleinkauf et al., Annu. Rev. Microbiol. 41:259-289, 1987; see also U.S. Pat. No. 5,652,116 and U.S. Pat. No. 5,795,738). Non-ribosomal synthesis is also known as thiotemplate synthesis (see e.g., Kleinkauf et al.). Peptide synthetase complexes typically include one or more peptide synthetase domains that recognize specific amino acids and are responsible for catalyzing addition of the amino acid to the polypeptide chain.

The catalytic steps in the addition of amino acids typically include: recognition of an amino acid by the peptide synthetase domain, activation of the amino acid (formation of an amino-acyl adenylate), binding of the activated amino acid to the enzyme via a thioester bond between the carboxylic group of the amino acid and an SH group of an enzymatic co-factor, which cofactor is itself bound to the enzyme inside each peptide synthetase domain, and formation of the peptide bonds among the amino acids.

A peptide synthetase domain comprises subdomains that carry out specific roles in these steps to form the peptide product. One subdomain, the adenylation (A) domain, is responsible for selectively recognizing and activating the amino acid that is to be incorporated by a particular unit of the peptide synthetase. The activated amino acid is joined to the peptide synthetase through the enzymatic action of another subdomain, the thiolation (T) domain, that is generally located adjacent to the A domain. Amino acids joined to successive units of the peptide synthetase are subsequently linked together by the formation of amide bonds catalyzed by another subdomain, the condensation (C) domain.

Peptide synthetase domains that catalyze the addition of D-amino acids often also have the ability to catalyze the racemization of L-amino acids to D-amino acids. Peptide synthetase complexes also typically include a conserved thioesterase domain that terminates the growing amino acid chain and releases the product.

The genes that encode peptide synthetase complexes have a modular structure that parallels the functional domain structure of the complexes (see, for example, Cosmina et al., Mol. Microbiol. 8:821, 1993; Krätzschmar et al., J. Bacteriol. 171:5422, 1989; Weckermann et al., Nucl. Acids Res. 16:11841, 1988; Smith et al., EMBO J. 9:741, 1990; Smith et al., EMBO J. 9:2743, 1990; MacCabe et al., J. Biol. Chem. 266:12646, 1991; Coque et al., Mol. Microbiol. 5:1125, 1991; Diez et al., J. Biol. Chem. 265:16358, 1990).

Hundreds of peptides are known to be produced by peptide synthetase complexes. Such nonribosomally-produced peptides often have non-linear structures, including cyclic structures exemplified by the peptides surfactin, cyclosporin, tyrocidin, and mycobacillin, or branched cyclic structures exemplified by the peptides polymyxin and bacitracin. Moreover, such nonribosomally-produced peptides may contain amino acids not usually present in ribosomally-produced polypeptides such as, for example, norleucine, beta-alanine and/or ornithine, as well as D-amino acids. Additionally or alternatively, such nonribosomally-produced peptides may comprise one or more non-peptide moieties that are covalently linked to the peptide. As one non-limiting example, surfactin is a cyclic lipopeptide that comprises a beta-hydroxy fatty acid covalently linked to the first glutamate of the lipopeptide. Other non-peptide moieties that are covalently linked to peptides produced by peptide synthetase complexes are known to those skilled in the art, including for example sugars, chlorine or other halogen groups, N-methyl and N-formyl groups, glycosyl groups, acetyl groups, etc.

Typically, each amino acid of the nonribosomally-produced peptide is specified by a distinct peptide synthetase domain. For example, the surfactin synthetase complex, which catalyzes the polymerization of the lipopeptide surfactin, consists of three enzymatic subunits. The first two subunits each comprise three peptide synthetase domains, whereas the third has only one. These seven peptide synthetase domains are responsible for the recognition, activation, binding and polymerization of L-Glu, L-Leu, D-Leu, L-Val, L-Asp, D-Leu and L-Leu, the amino acids present in surfactin.
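The one-domain-per-residue organization described above can be sketched as a simple data structure. This is an illustrative sketch only: the subunit names SrfA-A, SrfA-B and SrfA-C are those used in the definition of surfactin herein, the assignment of residues to subunits assumes that the seven residues are incorporated in the listed order by the three, three and one domains of the respective subunits, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthetaseDomain:
    """One peptide synthetase domain and the residue it specifies."""
    subunit: str
    amino_acid: str


# Assumed order: residues are incorporated in the order listed in the text.
SURFACTIN_DOMAINS = [
    SynthetaseDomain("SrfA-A", "L-Glu"),
    SynthetaseDomain("SrfA-A", "L-Leu"),
    SynthetaseDomain("SrfA-A", "D-Leu"),
    SynthetaseDomain("SrfA-B", "L-Val"),
    SynthetaseDomain("SrfA-B", "L-Asp"),
    SynthetaseDomain("SrfA-B", "D-Leu"),
    SynthetaseDomain("SrfA-C", "L-Leu"),
]


def domains_per_subunit(domains: list[SynthetaseDomain]) -> dict[str, int]:
    """Count the peptide synthetase domains carried by each subunit."""
    counts: dict[str, int] = {}
    for domain in domains:
        counts[domain.subunit] = counts.get(domain.subunit, 0) + 1
    return counts


def peptide_order(domains: list[SynthetaseDomain]) -> list[str]:
    """Residues in the order the domains would add them to the chain."""
    return [domain.amino_acid for domain in domains]


subunit_counts = domains_per_subunit(SURFACTIN_DOMAINS)
heptapeptide = peptide_order(SURFACTIN_DOMAINS)
```

Under this representation, an engineered polypeptide comprising only a single peptide synthetase domain corresponds to a list with one entry.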

A similar organization in discrete, repeated peptide synthetase domains occurs in various peptide synthetase genes in a variety of species, including bacteria and fungi, for example srfA (Cosmina et al., Mol. Microbiol. 8, 821-831, 1993), grsA and grsB (Krätzschmar et al., J. Bacteriol. 171, 5422-5429, 1989), tycA and tycB (Weckermann et al., Nucl. Acids Res. 16, 11841-11843, 1988) and ACV from various fungal species (Smith et al., EMBO J. 9, 741-747, 1990; Smith et al., EMBO J. 9, 2743-2750, 1990; MacCabe et al., J. Biol. Chem. 266, 12646-12654, 1991; Coque et al., Mol. Microbiol. 5, 1125-1133, 1991; Diez et al., J. Biol. Chem. 265, 16358-16365, 1990). The peptide synthetase domains of even distant species contain sequence regions with high homology, some of which are conserved and specific for all the peptide synthetases. Additionally, certain sequence regions within peptide synthetase domains are even more highly conserved among peptide synthetase domains which recognize the same amino acid (Cosmina et al., Mol. Microbiol. 8, 821-831, 1993).

Exemplary lipopeptides synthesized by peptide synthetase complexes in nature are listed below in Table 1 (See also the NORINE database, which provides access to information on peptides and lipopeptides that are known to be, or in some cases believed to be, produced by peptide synthetase enzymes; still further, see Segolene et al. (Ref 4)).

TABLE 1 Exemplary Lipopeptides Synthesized by Peptide SynthetasesLipopeptide Name Fatty Acid Component Fatty Acid Component name[Ala4]surfactin aC15 aC15:0-OH(3) 3-hydroxy-12-methyl-tetradecanoic acid[Ala4]surfactin iC14 iC14:0-OH(3) 3-hydroxy-12-methyl-tridecanoic acid[Ala4]surfactin iC15 iC15:0-OH(3) 3-hydroxy-13-methyl-tetradecanoic acid[Ala4]surfactin nC14 C14:0-OH(3) 3-hydroxy-tetradecanoic acid[Ala4]surfactin nC15 C15:0-OH(3) 3-hydroxy-pentadecanoic acid[Gln1]surfactin C15:0-OH(3) 3-hydroxy-pentadecanoic acid [Gln1]surfactinaC15 aC15:0-OH(3) 3-hydroxy-12-methyl-tetradecanoic acid [Gln1]surfactiniC15 iC15:0-OH(3) 3-hydroxy-13-methyl-tetradecanoic acid[Ile2.4.7]surfactin aC15:0-OH(3) 3-hydroxy-12-methyl-tetradecanoic acid[Ile4.7]surfactin aC15:0-OH(3) 3-hydroxy-12-methyl-tetradecanoic acid[Ile4]surfactin aC15:0-OH(3) 3-hydroxy-12-methyl-tetradecanoic acid[Ile7]surfactin aC15:0-OH(3) 3-hydroxy-12-methyl-tetradecanoic acid[Leu4]surfactin aC15:0-OH(3) 3-hydroxy-12-methyl-tetradecanoic acid[Phe25]syringopeptin 25A C10:0-OH(3) 3-hydroxy-decanoic acid[Val7]surfactin aC15:0-OH(3) 3-hydroxy-12-methyl-tetradecanoic acidA21978C1 aC11:0 8-methyldecanoic acid A21978C2 iC12:010-methylundecanoic acid A21978C3 aC13:0 10-methyldodecanoic acid A54145A iC10:0 decanoic acid A54145 A1 C10:0 decanoic acid A54145 B C10:0decanoic acid A54145 B1 iC10:0 decanoic acid A54145 C aC11:08-methyldecanoic acid A54145 D aC11:0 8-methyldecanoic acid A54145 EaC11:0 8-methyldecanoic acid A54145 F iC10:0 decanoic acid amphibactin BC14:0-OH(3) 3-hydroxy-tetradecanoic acid amphibactin C C16:1(9)-OH(3)3-hydroxy-9-hexadecenoic acid amphibactin D C14:0 tetradecanoic acidamphibactin E C16:1(9) 9-hexadecenoic acid amphibactin F C16:0-OH(3)3-hydroxy-hexadecanoic acid amphibactin G C18:1(9)-OH(3)3-hydroxy-9-octadecenoic acid amphibactin H C16:0 hexadecanoic acidamphibactin I C18:1(9) 9-octadecenoic acid amphisin C10:0-OH(3)3-hydroxy-decanoic acid amphomycin A1437 A iC13:1(3)11-methyl-3-dodecenoic acid 
amphomycin A1437 B iC14:1(3)12-methyl-3-tridecenoic acid amphomycin A1437 D aC15:1(3)12-methyl-3-tetradecenoic acid amphomycin A1437 E aC13:1(3)10-methyl-3-dodecenoic acid apramide A C8:0:1(7)-Me(2)2-methylact-7-ynoic acid apramide B C8:0:1(7) oct-7-ynoic acid apramideC C9:1(8)-Me(2) 2-methyl-8-noneic acid apramide D C8:0:1(7)-Me(2)2-methylact-7-ynoic acid apramide E C8:0:1(7) oct-7-ynoic acid apramideF C9:1(8)-Me(2) 2-methyl-8-noneic acid apramide G C8:0:1(7)-Me(2)2-methylact-7-ynoic acid aquachelin A C12:1(5) 2-methyl-5-dodecenoicacid aquachelin B C12:0 dodecanoic acid aquachelin C C14:1(7)7-tetradecenoic acid aquachelin D C14:0 tetradecanoic acid arthrofactinC10:0-OH(3) 3-hydroxy-decanoic acid arylomycin A1 iC11:09-methyldecanoic acid arylomycin A2 iC12:0 10-methylundecanoic acidarylomycin A3 C12:0 dodecanoic acid arylomycin A4 aC13:010-methyldodecanoic acid arylomycin A5 iC14:0 12-methyl-tridecanoic acidarylomycin B1 iC11:0 9-methyldecanoic acid arylomycin B2 iC12:010-methylundecanoic acid arylomycin B3 C12:0 dodecanoic acid arylomycinB4 aC13:0 10-methyldodecanoic acid arylomycin B5 iC13:011-methyldodecanoic acid arylomycin B6 iC14:0 12-methyl-tridecanoic acidarylomycin B7 aC15:0 12-methyltetradecanoic acid bacillomycin D-1C14:0-NH2(3) 3-amino-tetradecanoic acid bacillomycin D-2 iC15:0-NH2(3)3-amino-13-methyl-tetradecanoic acid bacillomycin D-3 aC15:0-NH2(3)3-amino-12-methyl-tetradecanoic acid bacillomycin D-4 C16:0-NH2(3)3-amino-hexadecanoic acid bacillomycin D-5 iC16:0-NH2(3)3-amino-14-methyl-pentadecanoic acid bacillomycin F-1 iC15:0-NH2(3)3-amino-13-methyl-tetradecanoic acid bacillomycin F-2 aC15:0-NH2(3)3-amino-12-methyl-tetradecanoic acid bacillomycin F-3 iC16:0-NH2(3)3-amino-14-methyl-pentadecanoic acid bacillomycin F-4 C16:0-NH2(3)3-amino-hexadecanoic acid bacillomycin F-5 iC17:0-NH2(3)3-amino-15-methyl-hexadecanoic acid bacillomycin F-6 aC17:0-NH2(3)3-amino-14-methyl-hexadecanoic acid bacillomycin L-1 C14:0-NH2(3)3-amino-tetradecanoic acid bacillomycin 
L-2 iC15:0-NH2(3)3-amino-13-methyl-tetradecanoic acid bacillomycin L-3 aC15:0-NH2(3)3-amino-12-methyl-tetradecanoic acid bacillomycin L-4 C16:0-NH2(3)3-amino-hexadecanoic acid bacillomycin L-5 iC16:0-NH2(3)3-amino-14-methyl-pentadecanoic acid beauverolide A C10:0-Me(4)—OH(3)3-hydroxy-4-methyl-decanoic acid beauverolide B C10:0-Me(4)—OH(3)3-hydroxy-4-methyl-decanoic acid beauverolide Ba C10:0-Me(4)—OH(3)3-hydroxy-4-methyl-decanoic acid beauverolide C C10:0-Me(4)—OH(3)3-hydroxy-4-methyl-decanoic acid beauverolide Ca C10:0-Me(4)—OH(3)3-hydroxy-4-methyl-decanoic acid beauverolide D C8:0-Me(4)—OH(3)4-methyl-3-hydroxy-octanoic acid beauverolide E C8:0-Me(4)—OH(3)4-methyl-3-hydroxy-octanoic acid beauverolide Ea C8:0-Me(4)—OH(3)4-methyl-3-hydroxy-octanoic acid beauverolide F C8:0-Me(4)—OH(3)4-methyl-3-hydroxy-octanoic acid beauverolide Fa C8:0-Me(4)—OH(3)4-methyl-3-hydroxy-octanoic acid beauverolide H C9:0-OH(3)3-hydroxy-nonanoic acid beauverolide I C8:0-Me(4)—OH(3)4-methyl-3-hydroxy-octanoic acid beauverolide II C10:0-Me(4)—OH(3)3-hydroxy-4-methyl-decanoic acid beauverolide III C8:0-Me(4)—OH(3)4-methyl-3-hydroxy-octanoic acid beauverolide IV C8:0-Me(4)—OH(3)4-methyl-3-hydroxy-octanoic acid beauverolide Ja C8:0-Me(4)—OH(3)4-methyl-3-hydroxy-octanoic acid beauverolide Ka C10:0-Me(4)—OH(3)3-hydroxy-4-methyl-decanoic acid beauverolide L C10:0-Me(4)—OH(3)3-hydroxy-4-methyl-decanoic acid beauverolide La C10:0-Me(4)—OH(3)3-hydroxy-4-methyl-decanoic acid beauverolide M C8:0-Me(4)—OH(3)4-methyl-3-hydroxy-octanoic acid beauverolide N C8:0-Me(4)—OH(3)4-methyl-3-hydroxy-octanoic acid beauverolide V C8:0-Me(4)—OH(3)4-methyl-3-hydroxy-octanoic acid beauverolide VI C8:0-Me(4)—OH(3)4-methyl-3-hydroxy-octanoic acid beauverolide VII C8:0-Me(4)—OH(3)4-methyl-3-hydroxy-octanoic acid beauverolide VIII C10:0-Me(4)—OH(3)3-hydroxy-4-methyl-decanoic acid callipeltin A iC8:0-Me(2.4)—OH(3)2,4,6-trimethyl-3-hydroxy-heptanoic acid callipeltin CiC8:0-Me(2.4)—OH(3) 2,4,6-trimethyl-3-hydroxy-heptanoic 
acid callipeltinD iC8:0-Me(2.4)—OH(3) 2,4,6-trimethyl-3-hydroxy-heptanoic acidcallipeltin F iC8:0-Me(2.4)—OH(3) 2,4,6-trimethyl-3-hydroxy-heptanoicacid callipeltin G iC8:0-Me(2.4)—OH(3)2,4,6-trimethyl-3-hydroxy-heptanoic acid callipeltin HiC8:0-Me(2.4)—OH(3) 2,4,6-trimethyl-3-hydroxy-heptanoic acid callipeltinI iC8:0-Me(2.4)—OH(3) 2,4,6-trimethyl-3-hydroxy-heptanoic acidcallipeltin J iC8:0-Me(2.4)—OH(3) 2,4,6-trimethyl-3-hydroxy-heptanoicacid callipeltin K iC8:0-Me(2.4)—OH(3)2,4,6-trimethyl-3-hydroxy-heptanoic acid callipeltin LiC8:0-Me(2.4)—OH(3) 2,4,6-trimethyl-3-hydroxy-heptanoic acid carmabin AC10:0:1(9)-Me(2.4) 2,4-dimethyl-dec-9-ynoic acid carmabin BC10:0-Me(2.4)-oxo(9) 9-oxo-2,4-dimethyldecanoic acid CDA1b C6:0-Ep(2)2-epoxy-hexanoic acid CDA2a C6:0-Ep(2) 2-epoxy-hexanoic acid CDA2bC6:0-Ep(2) 2-epoxy-hexanoic acid CDA2d C6:0-Ep(2) 2-epoxy-hexanoic acidCDA2fa C6:0-Ep(2) 2-epoxy-hexanoic acid CDA2fb C6:0-Ep(2)2-epoxy-hexanoic acid CDA3a C6:0-Ep(2) 2-epoxy-hexanoic acid CDA3bC6:0-Ep(2) 2-epoxy-hexanoic acid CDA4a C6:0-Ep(2) 2-epoxy-hexanoic acidCDA4b C6:0-Ep(2) 2-epoxy-hexanoic acid cormycin A C16:0-OH(3.4)3,4-dihydroxy-hexadecanoic acid corpeptin A C10:0-OH(3)3-hydroxy-decanoic acid corpeptin B C12:1(5)-OH(3)3-hydroxy-5-dodecenoic acid corrugatin C8:0 octanoic acid daptomycinC10:0 decanoic acid enduracidin A iC12:2(2.t4)10-methyl-2,trans4-undecanoic acid enduracidin B aC13:2(2.t4)10-methyl-2,trans4-dodecenoic acid fengycin A C16:0-OH(3)3-hydroxy-hexadecanoic acid fengycin B C16:0-OH(3)3-hydroxy-hexadecanoic acid friulimicin A iC13:1(3)11-methyl-3-dodecenoic acid friulimicin B iC14:1(3)12-methyl-3-tridecenoic acid friulimicin C aC13:1(3)10-methyl-3-dodecenoic acid friulimicin D aC15:1(3)12-methyl-3-tetradecenoic acid fuscopeptin A C8:0-OH(3)3-hydroxy-octanoic acid fuscopeptin B C10:0-OH(3) 3-hydroxy-decanoicacid Ile-polymyxin B1 aC9:0 6-methyloctanoic acid Ile-polymyxin E1 aC9:06-methyloctanoic acid Ile-polymyxin E2 iC8:0 6-methylheptanoic acidIle-polymyxin E8 
aC10:0 8-methyldecanoic acid iturin A-1 C13:0-NH2(3)3-amino-tridecanoic acid iturin A-2 C14:0-NH2(3) 3-amino-tetradecanoicacid iturin A-3 aC15:0-NH2(3) 3-amino-12-methyl-tetradecanoic aciditurin A-4 iC15:0-NH2(3) 3-amino-13-methyl-tetradecanoic acid iturin A-5C15:0-NH2(3) 3-amino-pentadecanoic acid iturin A-6 iC16:0-NH2(3)3-amino-14-methyl-pentadecanoic acid iturin A-7 C16:0-NH2(3)3-amino-hexadecanoic acid iturin A-8 aC17:0-NH2(3)3-amino-14-methyl-hexadecanoic acid iturin C-1 iC14:0-NH2(3)3-amino-12-methyl-tridecanoic acid iturin C-2 aC15:0-NH2(3)3-amino-12-methyl-tetradecanoic acid iturin C-3 iC16:0-NH2(3)3-amino-14-methyl-pentadecanoic acid iturin C-4 aC17:0-NH2(3)3-amino-14-methyl-hexadecanoic acid kulomo opunalide 1C8:0:1(7)-Me(2)—OH(3) 2-methyl-3-hydroxy-7-octynoic acid kulomoopunalide 2 C8:0:1(7)-Me(2)—OH(3) 2-methyl-3-hydroxy-7-octynoic acidlichenysin A aC13 aC13:0-OH(3) 3-hydroxy-10-methyl-dodecanoic acidlichenysin A aC15 aC15:0-OH(3) 3-hydroxy-12-methyl-tetradecanoic acidlichenysin A aC17 aC17:0-OH(3) 3-hydroxy-14-methyl-hexadecanoic acidlichenysin A iC12 iC12:0-OH(3) 3-hydroxy-10-methyl-undecanoic acidlichenysin A iC13 iC13:0-OH(3) 3-hydroxy-11-methyl-dodecanoic acidlichenysin A iC14 iC14:0-OH(3) 3-hydroxy-12-methyl-tridecanoic acidlichenysin A iC15 iC15:0-OH(3) 3-hydroxy-13-methyl-tetradecanoic acidlichenysin A iC16 iC16:0-OH(3) 3-hydroxy-14-methyl-pentadecanoic acidlichenysin A iC17 iC17:0-OH(3) 3-hydroxy-15-methyl-hexadecanoic acidlichenysin A nC12 C12:0-OH(3) 3-hydroxy-dodecanoic acid lichenysin AnC13 C13:0-OH(3) 3-hydroxy-tridecanoic acid lichenysin A nC14C14:0-OH(3) 3-hydroxy-tetradecanoic acid lichenysin A nC15 C15:0-OH(3)3-hydroxy-pentadecanoic acid lichenysin A nC16 C16:0-OH(3)3-hydroxy-hexadecanoic acid lokisin C10:0-OH(3) 3-hydroxy-decanoic acidmarinobactin A C12:0 dodecanoic acid marinobactin B C14:1(7)7-tetradecenoic acid marinobactin C C14:0 tetradecanoic acidmarinobactin D1 C16:1(9) 9-hexadecenoic acid marinobactin D2 C16:1(7)7-hexadecenoic 
acid marinobactin E C16:0 hexadecanoic acid massetolide AC10:0-OH(3) 3-hydroxy-decanoic acid massetolide B C11:0-OH(3)3-hydroxy-undecanoic acid massetolide C C12:0-OH(3) 3-hydroxy-dodecanoicacid massetolide D C10:0-OH(3) 3-hydroxy-decanoic acid massetolide EC10:0-OH(3) 3-hydroxy-decanoic acid massetolide F C10:0-OH(3)3-hydroxy-decanoic acid massetolide G C11:0-OH(3) 3-hydroxy-undecanoicacid massetolide H C12:0-OH(3) 3-hydroxy-dodecanoic acid massetolide LC10:0-OH(3) 3-hydroxy-decanoic acid mycosubtilin 1 C16:0-NH2(3)3-amino-hexadecanoic acid mycosubtilin 2 iC16:0-NH2(3)3-amino-14-methyl-pentadecanoic acid mycosubtilin 3 iC17:0-NH2(3)3-amino-15-methyl-hexadecanoic acid mycosubtilin 4 aC17:0-NH2(3)3-amino-14-methyl-hexadecanoic acid neamphamide A iC8:0-Me(2.4)—OH(3)2,4,6-trimethyl-3-hydroxy-heptanoic acid Nva-polymyxin E1 aC9:06-methyloctanoic acid papuamide A aC11:2(4.6)-Me(2.6)—OH(2.3)2,3-dihydroxy-2,6,8-trimethyldeca- (4Z,6E)-dienoic acid papuamide BaC11:2(4.6)-Me(2.6)—OH(2.3) 2,3-dihydroxy-2,6,8-trimethyldeca-(4Z,6E)-dienoic acid papuamide C aC11:2(4.6)-Me(2.6)—OH(2.3)2,3-dihydroxy-2,6,8-trimethyldeca- (4Z,6E)-dienoic acid papuamide DaC11:2(4.6)-Me(2.6)—OH(2.3) 2,3-dihydroxy-2,6,8-trimethyldeca-(4Z,6E)-dienoic acid pholipeptin C10:0-OH(3) 3-hydroxy-decanoic acidplusbacin A1 C14:0-OH(3) 3-hydroxy-tetradecanoic acid plusbacin A2iC15:0-OH(3) 3-hydroxy-13-methyl-tetradecanoic acid plusbacin A3iC16:0-OH(3) 3-hydroxy-14-methyl-pentadecanoic acid plusbacin A4C16:0-OH(3) 3-hydroxy-hexadecanoic acid plusbacin B1 C14:0-OH(3)3-hydroxy-tetradecanoic acid plusbacin B2 iC15:0-OH(3)3-hydroxy-13-methyl-tetradecanoic acid plusbacin B3 iC16:0-OH(3)3-hydroxy-14-methyl-pentadecanoic acid plusbacin B4 C16:0-OH(3)3-hydroxy-hexadecanoic acid polymyxin B1 aC9:0 6-methyloctanoic acidpolymyxin B2 iC8:0 6-methylheptanoic acid polymyxin B3 C8:0 octanoicacid polymyxin B4 C7:0 heptanoic acid polymyxin B5 C9:0 nonanoic acidpolymyxin B6 aC9:0-OH(3) 3-hydroxy-6-methyloctanoic acid polymyxin 
E1aC9:0 6-methyloctanoic acid polymyxin E2 iC8:0 6-methylheptanoic acidpolymyxin E3 C8:0 octanoic acid polymyxin E4 C7:0 heptanoic acidpolymyxin E7 iC9:0 7-methyloctanoic acid polymyxin M aC9:06-methyloctanoic acid pseudomycin A C14:0-OH(3.4)3,4-dihydroxy-tetradecanoic acid pseudomycin B C14:0-OH(3)3-hydroxy-tetradecanoic acid pseudomycin C C16:0-OH(3.4)3,4-dihydroxy-hexadecanoic acid pseudomycin C2 C16:0-OH(3)3-hydroxy-hexadecanoic acid pseudophomin A C10:0-OH(3)3-hydroxy-decanoic acid pseudophomin B C12:0-OH(3) 3-hydroxy-dodecanoicacid putisolvin I C6:0 hexanoic acid putisolvin II C6:0 hexanoic acidputisolvin III C6:0 hexanoic acid ramoplanin A1 C8:2(2.t4)2,trans4-octenoic acid ramoplanin A2 iC9:2(2.t4)2,trans4-7-methyl-octenoic acid ramoplanin A3 iC10:2(2.t4)2,trans4-8-methyl-noneoic acid serrawettin W1 C10:0-OH(3)3-hydroxy-decanoic acid serrawettin W2 C10:0-OH(3) 3-hydroxy-decanoicacid surfactin aC13 aC13:0-OH(3) 3-hydroxy-10-methyl-dodecanoic acidsurfactin aC15 aC15:0-OH(3) 3-hydroxy-12-methyl-tetradecanoic acidsurfactin iC12 iC12:0-OH(3) 3-hydroxy-10-methyl-undecanoic acidsurfactin iC14 iC14:0-OH(3) 3-hydroxy-12-methyl-tridecanoic acidsurfactin iC15 iC15:0-OH(3) 3-hydroxy-13-methyl-tetradecanoic acidsurfactin iC16 iC16:0-OH(3) 3-hydroxy-14-methyl-pentadecanoic acidsurfactin nC13 C13:0-OH(3) 3-hydroxy-tridecanoic acid surfactin nC14C14:0-OH(3) 3-hydroxy-tetradecanoic acid surfactin nC15 C15:0-OH(3)3-hydroxy-pentadecanoic acid syringafactin A C10:0-OH(3)3-hydroxy-decanoic acid syringafactin B C10:0-OH(3) 3-hydroxy-decanoicacid syringafactin C C10:0-OH(3) 3-hydroxy-decanoic acid syringafactin DC12:0-OH(3) 3-hydroxy-dodecanoic acid syringafactin E C12:0-OH(3)3-hydroxy-dodecanoic acid syringafactin F C12:0-OH(3)3-hydroxy-dodecanoic acid syringomycin A1 C10:0-OH(3) 3-hydroxy-decanoicacid syringomycin E C12:0-OH(3) 3-hydroxy-dodecanoic acid syringomycin GC14:0-OH(3) 3-hydroxy-tetradecanoic acid syringopeptin 22 PhvAC10:0-OH(3) 3-hydroxy-decanoic acid syringopeptin 
22 PhvB C12:0-OH(3)3-hydroxy-dodecanoic acid syringopeptin 22A C10:0-OH(3)3-hydroxy-decanoic acid syringopeptin 22B C12:0-OH(3)3-hydroxy-dodecanoic acid syringopeptin 25A C10:0-OH(3)3-hydroxy-decanoic acid syringopeptin 25B C12:0-OH(3)3-hydroxy-dodecanoic acid syringopeptin 508A C12:0-OH(3)3-hydroxy-dodecanoic acid syringopeptin 508B C14:0-OH(3)3-hydroxy-tetradecanoic acid syringopeptin SC 1 C10:0-OH(3)3-hydroxy-decanoic acid syringopeptin SC 2 C12:0-OH(3)3-hydroxy-dodecanoic acid syringostatin A C14:0-OH(3)3-hydroxy-tetradecanoic acid syringostatin B C14:0-OH(3.4)3,4-dihydroxy-tetradecanoic acid syringotoxin B C14:0-OH(3)3-hydroxy-tetradecanoic acid tensin C10:0-OH(3) 3-hydroxy-decanoic acidtolaasin A Pda pentanedioic acid tolaasin B C8:0-OH(3)3-hydroxy-octanoic acid tolaasin C C8:0-OH(3) 3-hydroxy-octanoic acidtolaasin D C8:0-OH(3) 3-hydroxy-octanoic acid tolaasin E C8:0-OH(3)3-hydroxy-octanoic acid tolaasin I C8:0-OH(3) 3-hydroxy-octanoic acidtolaasin II C8:0-OH(3) 3-hydroxy-octanoic acid tripropeptin AiC13:0-OH(3) 3-hydroxy-11-methyl-dodecanoic acid tripropeptin BiC14:0-OH(3) 3-hydroxy-12-methyl-tridecanoic acid tripropeptin CiC15:0-OH(3) 3-hydroxy-13-methyl-tetradecanoic acid tripropeptin DiC16:0-OH(3) 3-hydroxy-14-methyl-pentadecanoic acid tripropeptin EiC17:0-OH(3) 3-hydroxy-15-methyl-hexadecanoic acid tripropeptin ZiC12:0-OH(3) 3-hydroxy-10-methyl-undecanoic acid Val-polymyxin E1 aC9:06-methyloctanoic acid Val-polymyxin E2 iC8:0 6-methylheptanoic acidviscosin C10:0-OH(3) 3-hydroxy-decanoic acid viscosinamide C10:0-OH(3)3-hydroxy-decanoic acid White Line Inducing Principle C10:0-OH(3)3-hydroxy-decanoic acid

The present invention appreciates that, typically, in peptide synthetase complexes that synthesize lipopeptides, the first active peptide synthetase domain is the one that links a fatty acid to an amino acid; subsequent peptide synthetase domains typically add additional amino acids. In accordance with certain embodiments of the present invention, an acyl amino acid is prepared through use of an engineered peptide synthetase that comprises a first peptide synthetase domain found in a peptide synthetase complex that synthesizes a lipopeptide, and is engineered in that it is separated from at least some other domains found in the peptide synthetase complex.

Fatty acids utilized by naturally-occurring peptide synthetases can be β-hydroxy fatty acids (e.g., as found in surfactin and other β-hydroxy lipo-peptides described in Table 1). In other cases, utilized fatty acids are β-amino fatty acids (for example, as in iturin; see Table 1). In certain instances, utilized fatty acids are unmodified at the β-position (e.g., as in daptomycin and certain other lipo-peptides described in Table 1).

As described herein, the present invention encompasses the appreciation that, for all three types of fatty acids utilized by peptide synthetases that synthesize lipopeptides, the first protein domain of the first module of the relevant peptide synthetase complex typically plays a critical role in lipo-initiation. However, the precise mechanism of lipo-initiation differs for each of the three types of fatty acid. In general terms, the first module of a peptide synthetase enzyme that naturally creates a lipo-peptide has a particular organization. Each such module begins with a condensation domain that is required for the lipo-initiation reaction. The condensation domain is followed by an adenylation domain, which is followed by a thiolation domain (also known as a peptidyl carrier protein (PCP) domain). The adenylation domain selects the first amino acid that will be incorporated into the lipo-peptide and creates an amino acid adenylate. Subsequent to adenylation, the amino acid becomes tethered to the enzyme via linkage to a phosphopantetheine moiety, which is attached to the thiolation domain. The chemical reaction that results in tethering of the amino acid releases AMP as a byproduct.

For synthetases that attach a β-hydroxy fatty acid to the bound amino acid, the condensation domain of the first module utilizes β-hydroxy fatty acid CoA as a substrate, and transfers the fatty acid to the N-terminus of the amino acid substrate, which is tethered to the thiolation domain. No enzyme activity, other than the activity of the C-domain itself, is required for this particular reaction, although it has been reported that the srfD protein stimulates the lipo-initiation reaction (see Steller et al., which was cited in U.S. Pat. No. 7,981,685) (Ref 5).

For synthetases that attach a β-amino group to the fatty acid, the condensation domain has several sub-domains, each of which has a particular function (see FIG. 6 of Duitman et al.) (Ref 6). Considering the iturin synthetase (also known as the mycosubtilin synthetase) as a specific example, the mechanism of lipo-initiation is the following (see Hansen et al., (Ref 7) and Aron et al., (Ref 8) for details): the acyl ligase domain adenylates a long-chain fatty acid (in this case myristic acid), and the fatty acid is then transferred to an enzyme-linked 4-phosphopantetheine and AMP is released; in a separate reaction, the fenF gene product catalyzes the transfer of malonate (from malonyl-CoA) to a second acyl carrier domain (located within module 1); the β-ketoacyl synthetase domain catalyzes the condensation of the malonyl and acyl thioesters, creating a β-keto thioester; the β-keto thioester is converted into a β-amino fatty acid by a transaminase domain homologous to amino acid transferases; and the β-amino fatty acid is transferred to a thiolation domain and is then joined to the substrate amino acid (in this case asparagine), which was previously linked to the enzyme via the action of the module 1 adenylation domain. This series of reactions results in the joining of a β-amino fatty acid to an amino acid.

For synthetases that attach fatty acids that are unmodified at the β-position, the condensation domain of the first module catalyzes the transfer of the fatty acid to the N-terminus of the amino acid substrate, which is tethered to the thiolation domain. Considering the daptomycin synthetase as an example, two additional proteins are involved: an acyl-CoA ligase (DptE) (sequence listing GenBank: AAX31555.1) and an acyl carrier protein (DptF) (sequence listing GenBank: AAX31556.1). DptE activates the fatty acid substrate by linking it to CoA, and the activated fatty acid is then transferred to DptF, and subsequently transferred to the enzyme-bound amino acid substrate (see Wittmann et al.) (Ref 9). Note that studies conducted in vitro have confirmed that DptE transfers the fatty acid to DptF, but experiments aimed at demonstrating the involvement of the condensation domain in subsequent transfer of the fatty acid from DptF to the amino acid substrate appear not to have been reported in the literature.

Phylogenetic analysis of peptide synthetase condensation domains is described in Roongsawang et al. (Ref 2), and in Rausch et al. (Ref 3). Those of ordinary skill in the art, guided by the present disclosure, and optionally in consultation with such references, can readily identify, select, and/or engineer appropriate peptide synthetase condensation domains for use in designing, constructing, producing, and/or otherwise providing engineered peptide synthetases for production of acyl amino acids in accordance with the present invention.

Non-limiting examples of peptide synthetase complexes that may contain peptide synthetase domains useful in the identification, selection, design, and/or production of engineered peptide synthetases as described herein include, for example, surfactin synthetase, fengycin synthetase, arthrofactin synthetase, lichenysin synthetase, syringomycin synthetase, syringopeptin synthetase, saframycin synthetase, gramicidin synthetase, cyclosporin synthetase, tyrocidin synthetase, mycobacillin synthetase, polymyxin synthetase, bacitracin synthetase, and combinations thereof.

Thus, the present invention provides engineered peptide synthetases, which in some embodiments comprise or consist of isolated peptide synthetase domains from reference peptide synthetase complexes that synthesize lipopeptides. In some embodiments, such reference peptide synthetase complexes are known peptide synthetase complexes. In some embodiments, such reference peptide synthetase complexes are naturally occurring peptide synthetase complexes. In some embodiments, provided engineered peptide synthetases comprise or consist of a single peptide synthetase domain. In some embodiments, provided engineered peptide synthetases comprise or consist of a first peptide synthetase domain from a peptide synthetase complex that synthesizes a lipopeptide.

In some embodiments, an engineered peptide synthetase, peptide synthetase domain, or component thereof (e.g., adenylation (A) domain, thiolation (T) domain, and/or condensation (C) domain) may contain one or more sequence modifications as compared with a reference peptide synthetase, domain, or component. Typically, however, an engineered peptide synthetase, peptide synthetase domain, or component thereof shows a high overall degree of sequence identity and/or homology with its reference peptide synthetase, domain, or component.

In some embodiments, an engineered peptide synthetase, peptide synthetase domain, or component thereof contains insertions, deletions, substitutions or inversions of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or more amino acids as compared to its relevant reference.

In certain embodiments, such amino acid substitutions result in an engineered polypeptide that comprises an amino acid whose side chain is structurally similar to that of the corresponding amino acid in the relevant reference. For example, amino acids with aliphatic side chains, including glycine, alanine, valine, leucine, and isoleucine, may be substituted for each other; amino acids having aliphatic-hydroxyl side chains, including serine and threonine, may be substituted for each other; amino acids having amide-containing side chains, including asparagine and glutamine, may be substituted for each other; amino acids having aromatic side chains, including phenylalanine, tyrosine, and tryptophan, may be substituted for each other; amino acids having basic side chains, including lysine, arginine, and histidine, may be substituted for each other; and amino acids having sulfur-containing side chains, including cysteine and methionine, may be substituted for each other.
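By way of non-limiting illustration, the side-chain groupings recited above can be expressed as a simple lookup. The following is a minimal sketch only; the names SIDE_CHAIN_GROUPS and is_conservative are hypothetical and are introduced here for illustration, and the use of one-letter amino acid codes is an assumption of the sketch rather than a requirement of the present disclosure.

```python
# Side-chain groups as recited in the preceding paragraph, written with
# standard one-letter amino acid codes (an assumption of this sketch).
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),        # glycine, alanine, valine, leucine, isoleucine
    "aliphatic-hydroxyl": set("ST"),  # serine, threonine
    "amide-containing": set("NQ"),    # asparagine, glutamine
    "aromatic": set("FYW"),           # phenylalanine, tyrosine, tryptophan
    "basic": set("KRH"),              # lysine, arginine, histidine
    "sulfur-containing": set("CM"),   # cysteine, methionine
}


def is_conservative(ref_residue: str, new_residue: str) -> bool:
    """Return True if both residues belong to the same recited group.

    Residues that appear in none of the recited groups (e.g. D, E, P)
    are never reported as structurally similar by this sketch.
    """
    ref_residue = ref_residue.upper()
    new_residue = new_residue.upper()
    return any(
        ref_residue in group and new_residue in group
        for group in SIDE_CHAIN_GROUPS.values()
    )
```

For instance, is_conservative("L", "I") evaluates to True because leucine and isoleucine are both in the aliphatic group, whereas is_conservative("L", "K") evaluates to False.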

In certain embodiments, amino acid substitutions result in an engineered polypeptide that comprises an amino acid whose side chain exhibits similar chemical properties to a corresponding amino acid present in a relevant reference. For example, in certain embodiments, amino acids that comprise hydrophobic side chains may be substituted for each other. In some embodiments, amino acids may be substituted for each other if their side chains are of similar molecular weight or bulk. For example, an amino acid in an engineered domain may be substituted for an amino acid present in the relevant reference if its side chain exhibits a minimum/maximum molecular weight or takes up a minimum/maximum amount of space.

In certain embodiments, an engineered polypeptide shows at least about 50, 55, 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% homology or identity with a relevant reference (e.g., over a portion that spans at least 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000 or more amino acids).
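As a non-limiting illustration of how percent identity over a portion might be computed, the following minimal sketch compares two sequences that have already been aligned to equal length. The function name percent_identity is hypothetical, the "-" gap character is an assumed convention, and the choice to exclude gap columns from the comparison is an assumption of the sketch; other conventions for handling gaps exist and may give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap ("-") are skipped, which
    is one convention among several (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap column: not counted toward identity
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared
```

Under these assumptions, a five-residue portion that differs from its reference at one position has 80% identity over that portion.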

In certain embodiments, engineered polypeptides of the present invention comprise two or more polypeptide domains that occur in one or more naturally occurring or other known reference polypeptides, but that are i) separated from one or more sequence elements with which they are associated in the reference polypeptide; ii) associated with one or more sequence elements with which they are not associated in the reference polypeptide(s); and/or iii) associated in a different way (e.g., in a different order or via a different linkage) with one or more sequence elements with which they are associated in the reference polypeptide. As a non-limiting example, two naturally occurring polypeptide domains that are directly covalently linked in nature may be separated in an engineered polypeptide by one or more intervening amino acid residues. Additionally or alternatively, two naturally occurring polypeptide domains that are indirectly covalently linked in nature may be directly covalently linked in an engineered polypeptide, e.g., by removing one or more intervening amino acid residues.

In certain embodiments, two naturally occurring peptide domains that are from different peptide synthetases are covalently joined to generate an engineered polypeptide of the present invention.

In some embodiments, engineered peptide synthetases provided by and/or for use in accordance with the present invention do not include thioesterase and/or reductase domains. Such domains are known to function in the release of peptides and lipopeptides from the nonribosomal peptide synthetase complexes that produce them. In one aspect, the present invention provides the surprising finding that, notwithstanding their central role in release of lipopeptides from peptide synthetase complexes, such domains are often not required for release of acyl amino acids from engineered peptide synthetases as described herein. Thus, thioesterase and/or reductase domains may optionally be included in some embodiments of the present invention, but are specifically excluded in some embodiments.

In certain embodiments, compositions and methods of the present invention are useful in large-scale production of acyl amino acids. In certain embodiments, acyl amino acids are produced in commercially viable quantities using compositions and methods of the present invention. For example, engineered polypeptides of the present invention may be used to produce acyl amino acids to a level of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900, 1000 mg/L or higher. As will be appreciated by those skilled in the art, biological production of acyl amino acids using engineered polypeptides of the present invention achieves certain advantages over other methods of producing acyl amino acids. For example, as compared to chemical production methods, production of acyl amino acids using compositions and methods of the present invention utilizes starting materials that are more readily available and easier to store, reduces the necessity of using harsh and sometimes dangerous chemical reagents in the manufacturing process, reduces the difficulty and improves the efficiency of the synthesis itself by utilizing host cells as bioreactors, and reduces the fiscal and environmental cost of disposing of chemical by-products. Other advantages will be clear to practitioners who utilize compositions and methods of the present invention.

Acyl Amino Acids and Compositions

The present invention provides compositions comprising acyl amino acids produced by engineered peptide synthetases as described herein. In some embodiments, such compositions comprise a collection of individual acyl amino acid molecules that are related to one another in that they are each synthesized by the same engineered peptide synthetase and together represent a distribution of chemical entities, varied in precise chemical structure (e.g., due to varying length and/or composition of acyl chains, linkages within such acyl chains and/or between an acyl chain and the amino acid, etc.), that are synthesized by the relevant engineered peptide synthetase, under the conditions of synthesis (e.g., in vivo or in vitro). In some embodiments, a provided composition includes straight-chain acyl moieties, branched acyl moieties, and/or combinations thereof.

That is, it will be appreciated by those skilled in the art that, in some embodiments, one feature of engineered production of acyl amino acids is that engineered peptide synthetases may not generate pure populations of single chemical entities, particularly when acting in vivo. Thus, as noted above, the present invention provides acyl amino acid compositions comprising distributions of chemical entities. In some embodiments, the present invention provides acyl amino acid compositions in which substantially all acyl amino acids comprise the same amino acid moiety, but the composition includes a distribution of acyl moieties.

As described herein, the present invention provides a wide variety of acyl amino acids and compositions. In some embodiments, the present invention provides acyl amino acids and compositions in which the amino acid moiety is or comprises one found in an amino acid selected from the group consisting of alanine, arginine, asparagine, aspartic acid, cysteine, glutamic acid, glutamine, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and/or valine. Alternatively or additionally, in some embodiments, the present invention provides acyl amino acids and compositions in which the amino acid moiety is or comprises one found in an amino acid selected from the group consisting of selenocysteine and/or pyrrolysine. In some embodiments, the present invention provides acyl amino acids and compositions in which the amino acid moiety is or comprises one found in an amino acid selected from the group consisting of norleucine, beta-alanine and/or ornithine. In some embodiments, the present invention provides acyl amino acids and compositions in which the amino acid moiety is or comprises one found in an amino acid selected from the group consisting of L-amino acids. In some embodiments, the present invention provides acyl amino acids and compositions in which the amino acid moiety is or comprises one found in an amino acid selected from the group consisting of D-amino acids. In some embodiments, the present invention provides acyl amino acids and compositions in which the amino acid moiety is or comprises one found in D-glu or D-diaminopropionic acid. Those skilled in the art will be aware of appropriate amino acid substrates, usable by peptide synthetases as described herein (and particularly by engineered peptide synthetases as described herein) to generate acyl amino acids containing such amino acid moieties. In some embodiments, the amino acid substrate is or comprises the recited amino acid.
In some embodiments, the present invention provides acyl amino acids and compositions in which the acyl group is found in a saturated fatty acid such as butyric acid, caproic acid, caprylic acid, capric acid, lauric acid, myristic acid, palmitic acid, stearic acid, arachidic acid, behenic acid, and/or lignoceric acid. In some embodiments, the present invention provides acyl amino acids and compositions in which the acyl group is found in an unsaturated fatty acid such as, without limitation, myristoleic acid, palmitoleic acid, oleic acid, linoleic acid, alpha-linolenic acid, arachidonic acid, eicosapentaenoic acid, erucic acid, and/or docosahexaenoic acid. Other saturated and unsaturated fatty acids whose acyl moieties may be used in accordance with the present invention will be known to those of ordinary skill in the art. In certain embodiments, acyl amino acids and compositions provided by the present invention comprise beta-hydroxy fatty acids as the fatty acid moiety. As is understood by those of ordinary skill in the art, beta-hydroxy fatty acids comprise a hydroxy group attached to the third carbon of the fatty acid chain, the first carbon being the carbon of the carboxylate group.

In some embodiments, the present invention provides acyl amino acids and compositions in which the acyl group comprises or consists of fatty acid chains with a length within a range bounded by a lower length selected from the group consisting of C2, C3, C4, C5, C6, C7, C8, C9, C10, C11, C12, C13, C14, C15, C16, C17, C18, C19, C20, C21, C22, C23, C24, C25, C26, C27, C28, C29, C30, and an upper length selected from the group consisting of C30, C29, C28, C27, C26, C25, C24, C23, C22, C21, C20, C19, C18, C17, C16, C15, C14, C13, C12, C11, C10, C9, C8, C7, C6, C5, C4, C3, C2, and C1, wherein the upper length is the same as or larger than the lower length. In some particular embodiments, the present invention provides acyl amino acids and compositions in which the acyl group comprises or consists of C10-C14 fatty acid chains, C13-16 fatty acid chains, C13-15 fatty acid chains, C16-24 fatty acid chains, C18-22 fatty acid chains, C18-24 fatty acid chains, or C8-C16 fatty acid chains. In some embodiments, the present invention provides acyl amino acids and compositions in which the acyl group comprises, consists predominantly of, or consists of C5, C6, C7, C8, C9, C10, C11, C12, C13, C14, C15, C16, C17, C18, C19, and/or C20 fatty acid chains. In some embodiments, the present invention provides acyl amino acids and compositions in which the acyl group comprises, consists predominantly of, or consists of C8, C9, C10, C11, C12, C13, C14, C15, and/or C16 fatty acid chains. In some embodiments, the present invention provides acyl amino acids and compositions in which the acyl group comprises, consists predominantly of, or consists of C12, C13, C14, C15, and/or C16 fatty acid chains.
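The constraint that the upper length be the same as or larger than the lower length can be illustrated, in a non-limiting way, by enumerating the permitted pairs. The following is a minimal sketch; the function name chain_length_ranges is hypothetical, and chain lengths are represented simply as integer carbon counts, which is an assumption of the sketch.

```python
def chain_length_ranges(lower_bounds, upper_bounds):
    """Return every (lower, upper) pair in which upper >= lower."""
    return [
        (lower, upper)
        for lower in lower_bounds
        for upper in upper_bounds
        if upper >= lower
    ]


# Lower lengths C2..C30 and upper lengths C1..C30, as recited above.
LOWER_LENGTHS = range(2, 31)
UPPER_LENGTHS = range(1, 31)
VALID_RANGES = chain_length_ranges(LOWER_LENGTHS, UPPER_LENGTHS)
```

Under this representation, a range such as C10-C14 appears as the pair (10, 14), while a pair such as (14, 10) is excluded because its upper length is smaller than its lower length.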

In some embodiments, the present invention provides acyl amino acid compositions in which all acyl amino acids comprise the same amino acid moiety (or comprise an amino acid moiety from the same amino acid).

In some embodiments, the present invention provides acyl amino acid compositions in which different acyl amino acids within the composition have different acyl moieties (e.g., acyl moieties that differ in composition, structure, branching, and/or length (of one or more chains)). In some embodiments, such compositions predominantly include acyl moieties of a length (or within a range of lengths) as set forth above. In some such embodiments, such predominant acyl moieties are present in the composition at a level of at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%. The Figures and Examples herein depict and/or describe certain particular acyl amino acids and/or acyl amino acid compositions that are provided by and can be prepared in accordance with certain embodiments of the present invention. To give but a few particular examples, the present invention specifically exemplifies and/or otherwise provides certain acyl amino acids and/or acyl amino acid compositions comprising, consisting predominantly of, or consisting of 2,4-diaminobutyric acid, (2S)-2,3-diaminobutyric acid, 2,3-diaminopropionic acid, β-hydroxy myristoyl glutamate, β-hydroxy myristoyl diaminopropionic acid, betaines, cocoyl glycinate, glycine laurate, glutamine laurate, etc. For example, in some particular embodiments, the present invention provides acyl amino acid compositions in which the amino acid moiety within acyl amino acids in the composition is from glycine or glutamate, and the fatty acid moiety is predominantly a C12 fatty acid (i.e., is from lauric acid); in some such embodiments, all acyl amino acids in the composition have the same amino acid moiety.
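As a non-limiting illustration of what it means for an acyl moiety to be predominant at a given level, the following minimal sketch tallies the acyl chain lengths observed in a composition and reports the most abundant one with its percentage. The function names acyl_distribution and predominant_acyl are hypothetical, the input is assumed to be one integer carbon count per acyl amino acid molecule (so that percentages are by number of molecules rather than by mass), and the sample values in the usage note are invented for illustration rather than taken from the Examples.

```python
from collections import Counter


def acyl_distribution(chain_lengths):
    """Map each acyl chain length to its percentage of the composition."""
    counts = Counter(chain_lengths)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("composition is empty")
    return {length: 100.0 * n / total for length, n in counts.items()}


def predominant_acyl(chain_lengths):
    """Return (chain length, percentage) for the most abundant acyl moiety."""
    distribution = acyl_distribution(chain_lengths)
    length = max(distribution, key=distribution.get)
    return length, distribution[length]
```

For a hypothetical composition of twenty molecules in which nineteen carry a C12 acyl chain and one carries a C14 acyl chain, predominant_acyl([12] * 19 + [14]) returns (12, 95.0), i.e. the C12 moiety is present at a level of 95%.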

Host Cells

Engineered polypeptides of the present invention may be introduced into any of a variety of host cells for the production of acyl amino acids. As will be understood by those skilled in the art, engineered polypeptides will typically be introduced into a host cell in an expression vector. So long as a host cell is capable of receiving and propagating such an expression vector, and is capable of expressing the engineered polypeptide, such a host cell is encompassed by the present invention. An engineered polypeptide of the present invention may be transiently or stably introduced into a host cell of interest. For example, an engineered polypeptide of the present invention may be stably introduced by integrating the engineered polypeptide into the chromosome of the host cell. Additionally or alternatively, an engineered polypeptide of the present invention may be transiently introduced by introducing a vector comprising the engineered polypeptide into a host cell, which vector is not integrated into the genome of the host cell, but is nevertheless propagated by the host cell.

In certain embodiments, a host cell is a bacterium. Non-limiting examples of bacteria that are useful as host cells of the present invention include bacteria of the genera Escherichia, Streptococcus, Bacillus, and a variety of other genera known to those skilled in the art. In certain embodiments, an engineered polypeptide of the present invention is introduced into a host cell of the species Bacillus subtilis.

Bacterial host cells of the present invention may be wild type. Alternatively, bacterial host cells of the present invention may comprise one or more genetic changes as compared to wild type species. In certain embodiments, such genetic changes are beneficial to the production of acyl amino acids in the bacterial host. For example, such genetic changes may result in increased yield or purity of the acyl amino acid, and/or may endow the bacterial host cell with various advantages useful in the production of acyl amino acids (e.g., increased viability, ability to utilize alternative energy sources, etc.).

In certain embodiments, the host cell is a plant cell. Those skilled in the art are aware of standard techniques for introducing an engineered polypeptide of the present invention into a plant cell of interest such as, without limitation, gold bombardment and Agrobacterium transformation. In certain embodiments, the present invention provides a transgenic plant that comprises an engineered polypeptide that produces an acyl amino acid of interest. Any of a variety of plant species may be made transgenic by introduction of an engineered polypeptide of the present invention, such that the engineered polypeptide is expressed in the plant and produces an acyl amino acid of interest. The engineered polypeptide of transgenic plants of the present invention may be expressed systemically (e.g., in each tissue at all times) or only in localized tissues and/or during certain periods of time. Those skilled in the art will be aware of various promoters, enhancers, etc. that may be employed to control when and where an engineered polypeptide is expressed.

Insects, including insects that are threats to agricultural crops, produce acyl amino acids that are likely to be important or essential for insect physiology. For example, an enzyme related to peptide synthetases produces the product of the Drosophila Ebony genes, which product is important for proper pigmentation of the fly, but is also important for proper function of the nervous system (see e.g., Richardt et al., Ebony, a novel nonribosomal peptide synthetase for beta-alanine conjugation with biogenic amines in Drosophila, J. Biol. Chem., 278(42):41160-6, 2003). Acyl amino acids are also produced by certain Lepidoptera species that are a threat to crops. Thus, compositions and methods of the present invention may be used to produce transgenic plants that produce an acyl amino acid of interest that kills such insects or otherwise disrupts their adverse effects on crops. For example, an engineered polypeptide that produces an acyl amino acid that is toxic to a given insect species may be introduced into a plant such that insects that infest such a plant are killed. Additionally or alternatively, an engineered polypeptide that produces an acyl amino acid that disrupts an essential activity of the insect (e.g., feeding, mating, etc.) may be introduced into a plant such that the commercially adverse effects of insect infestation are minimized or eliminated. In certain embodiments, an acyl amino acid of the present invention that mitigates an insect's adverse effects on a plant is an acyl amino acid that is naturally produced by such an insect. In certain embodiments, an acyl amino acid of the present invention that mitigates an insect's adverse effects on a plant is a structural analog of an acyl amino acid that is naturally produced by such an insect. Compositions and methods of the present invention are extremely powerful in allowing the construction of engineered polypeptides that produce any of a variety of acyl amino acids, which acyl amino acids can be used in controlling or eliminating harmful insect infestation of one or more plant species.

Producing Acyl Amino Acids and Compositions

Acyl amino acids and compositions may be produced by engineered peptide synthetases as described herein. In some embodiments, acyl amino acids are produced in vitro. In some embodiments, acyl amino acids are produced in vivo, for example in host cells engineered to express an engineered peptide synthetase or component or domain thereof.

In some embodiments, acyl amino acids are produced in association with one or more components of a cell and/or with an engineered peptide synthetase. In some embodiments, acyl amino acid compositions are subjected to one or more isolation procedures, for example as is known in the art, e.g., to separate produced acyl amino acid compounds from one or more components of their production system (e.g., from an engineered peptide synthetase or component or domain thereof, and/or from one or more components of a cell such as an engineered cell).

EXEMPLIFICATION

Example 1: Engineering Peptide Synthetases to Produce Acyl Amino Acids with β-Hydroxy Amino Acids

In some embodiments of the present invention, an engineered peptide synthetase that produces an acyl amino acid is designed and/or produced by isolating and/or otherwise engineering a known peptide synthetase domain (e.g., by separating a first peptide synthetase domain that is found in a peptide synthetase complex that synthesizes a lipopeptide from other elements, domains, or components of the lipopeptide-synthesizing complex) to produce the acyl amino acid.

For example, an acyl amino acid with a β-hydroxy fatty acid can be created by expressing Module 1 of a synthetase, such as the srf (surfactin) synthetase, in an appropriate host organism. Since Module 1 of the srfAA (sequence listing srfAA module 1) is glutamate-specific, the expression of Module 1 in an appropriate host leads to the production of β-hydroxy myristoyl glutamate.

The same approach can be used to link fatty acids to a variety of different amino acids since there are known (sequenced) “Module 1 DNA segments”, which can be cloned from various natural systems, with adenylation domains specific for four distinct amino acids (Leu, Glu, Ser or Dhb; see Table). In addition, a variety of naturally occurring β-hydroxy lipo-peptides (which are believed to be produced by peptide synthetase enzymes) have been reported, for which the gene cluster encoding the synthetase responsible for their production has not been sequenced. A new β-hydroxy acyl amino acid can be produced by using standard molecular biology techniques to specifically identify “Module 1” of one of those synthetases (which belongs to the set of “Module 1's” that have not yet been sequenced) and expressing that Module 1 in an appropriate host. This approach would lead to the generation of additional new β-hydroxy acyl amino acids, including β-hydroxy acyl: Phe, D-Ala, 2,3-dehydro-2-aminobutyric acid, NMe-Ile, Gly, Thr and D-allo-threonine. The Table below summarizes various attributes of known lipopeptides and the peptide synthetases that synthesize them in nature, including the amino acid acyl group and amino acid specificity of the relevant Module 1.

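Columns: number; name of lipopeptide; length of fatty acid chain; reference for fatty acid information; name of module 1 gene; reference for gene information; amino acid specificity of the module; gene encoding adenylate-forming enzyme; gene encoding ACP; gene encoding malonyl-CoA transacylase. In the entries below, "ND" and "N/A" are reproduced as given.

The “fatty acid adding” domain of these 18 synthetases adds β-hydroxy fatty acids to the amino acid

1. amphisin (one form is arthrofactin). Chain: C-10. Fatty acid reference: A New Lipopeptide Biosurfactant Produced by Arthrobacter sp. Strain MIS38 (database 692). Module 1 gene: arfA module 1. Gene reference: Cloning and Characterization of the Gene Cluster Encoding Arthrofactin Synthetase from Pseudomonas sp. MIS38 (database 691). Amino acid: Leu. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
2. beauverolide. Chain: C8 to C-10. Fatty acid reference: Extraribosomal cyclic tetradepsipeptides beauverolides: profiling and modeling the fragmentation pathways (citation from PubMed). Module 1 gene: ND. Gene reference: synthetase genes have not been identified. Amino acid: Phe. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
3. callipeltin. Chain: C-8. Fatty acid reference: Isolation of callipeltins A-C and of two new open-chain derivatives of callipeltin A from the marine sponge Latrunculia sp. A revision of the stereostructure of callipeltins (ref from Norine database). Module 1 gene: ND. Gene reference: synthetase genes have not been identified. Amino acid: D-Ala. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
4. corpeptin. Chain: C-10 to C-12. Fatty acid reference: Zampella A, Randazzo A, Borbone N, Luciani S, Trevisi L, Debitus C, D Auria MV, Tetrahedron Letters, 2002, 43 (35), pp. 6163-6166. Module 1 gene: ND. Gene reference: synthetase genes have not been identified. Amino acid: 2,3-dehydro-2-aminobutyric acid. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
5. fengycin. Chain: C-14 to C-18. Fatty acid reference: Application of electrospray ionization mass spectrometry in rapid typing of fengycin homologues produced by Bacillus subtilis. Module 1 gene: fenC1. Gene reference: Functional and Transcriptional Analyses of a Fengycin Synthetase Gene, fenC, from Bacillus subtilis. Amino acid: Glu. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
6. fuscopeptin. Chain: C-8 to C-10. Fatty acid reference: Structure of fuscopeptins, phytotoxic metabolites of Pseudomonas fuscovaginae. Module 1 gene: ND. Gene reference: synthetase genes have not been identified. Amino acid: 2,3-dehydro-2-aminobutyric acid. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
7. kulomoopunalide. Chain: 2-hydroxyisovaleric and C8:01 (7)-Me(2) OH(3) and 2-hydroxyisovaleric. Fatty acid reference: More Peptides and Other Diverse Constituents of the Marine Mollusk Philinopsis speciosa. Module 1 gene: ND. Gene reference: synthetase genes have not been identified. Amino acid: NMe-Ile. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
8. lichenysin. Chain: C-15. Fatty acid reference: Structural and Immunological Characterization of a Biosurfactant Produced by Bacillus licheniformis JF-2. Module 1 gene: licA module 1. Gene reference: Molecular and Biochemical Characterization of the Protein Template Controlling Biosynthesis of the Lipopeptide Lichenysin. Amino acid: Glu. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
9. papuamide. Chain: C-11. Fatty acid reference: Papuamides A-D, HIV-inhibitory and cytotoxic depsipeptides from the sponges Theonella mirabilis and Theonella swinhoei collected in Papua New Guinea. Module 1 gene: ND. Gene reference: synthetase genes have not been identified. Amino acid: Gly. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
10. plusbacin. Chain: C-14 to C-16. Fatty acid reference: Structures of new peptide antibiotics, plusbacins A1-A4 and B1-B4. Module 1 gene: ND. Gene reference: synthetase genes have not been identified. Amino acid: Thr. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
11. serrawettin. Chain: C-10. Fatty acid reference: A Novel Extracellular Cyclic Lipopeptide Which Promotes Flagellum-Dependent and -Independent Spreading Growth of Serratia marcescens. Gene reference: Serratia marcescens gene required for surfactant serrawettin W1 production encodes putative aminolipid synthetase belonging to nonribosomal peptide synthetase family. Amino acid: D-Leu. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
12. surfactin. Chain: C13 to C15. Fatty acid reference: Separation and Characterization of Surfactin Isoforms Produced by Bacillus subtilis OKB 105. Module 1 gene: srfA module 1. Gene reference: Sequence and analysis of the genetic locus responsible for surfactin synthesis in Bacillus subtilis. Amino acid: Glu. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
13. syringafactin. Chain: C-10 to C-12. Fatty acid reference: Identification of a biosynthetic gene cluster and the six associated lipopeptides involved in swarming motility of Pseudomonas syringae pv. tomato DC3000. Module 1 gene: SyfA module 1. Gene reference: Identification of a biosynthetic gene cluster and the six associated lipopeptides involved in swarming motility of Pseudomonas syringae pv. tomato DC3000. Amino acid: Leu. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
14. syringomycin. Chain: C12 to C14. Fatty acid reference: The structure of syringomycins A1, E and G. Module 1 gene: SyrE1. Gene reference: Characterization of the Syringomycin Synthetase Gene Cluster. Amino acid: Ser. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
15. syringopeptin. Chain: C10 to C14. Fatty acid reference: Novel Cyclic Lipodepsipeptide from Pseudomonas syringae pv. lachrymans Strain 508 and Syringopeptin Antimicrobial Activities. Module 1 gene: SypA-M1. Gene reference: the sypA sypB sypC synthetase genes encode twenty-two modules involved in nonribosomal peptide synthesis of syringopeptin, Pseudomonas syringae. Amino acid: Dhb. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
16. tolaasin. Chain: C8 and glutaric (pentadecanoic). Fatty acid reference: tolaasins A-E, five new lipodepsipeptides produced by Pseudomonas tolaasii. Gene reference: synthetase genes have not been identified. Amino acid: 2,3-dehydro-2-aminobutyric acid (dhb). Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
17. tripropeptin. Chain: C12 to C17. Fatty acid reference: tripropeptins, novel antimicrobial agents produced by Lysobacter sp. Gene reference: synthetase genes have not been identified. Amino acid: D-allo-threonine. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.
18. Viscosin. Chain: C10 to C12. Fatty acid reference: Massetolides A-H, antimycobacterial cyclic depsipeptides produced by two pseudomonads isolated from marine habitats. Gene reference: Massetolide A biosynthesis in Pseudomonas fluorescens. Amino acid: L-Leu. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: N/A.

The “fatty acid adding” domain of these 14 synthetases adds fatty acids to the amino acid (no β-hydroxy)

19. A54145. Chain: C10 to C11. Fatty acid reference: A54145, a new lipopeptide antibiotic complex: isolation and characterization. Gene reference: the lipopeptide antibiotic A54145 biosynthetic gene cluster from Streptomyces fradiae. Amino acid: Trp. Adenylate-forming enzyme gene: lptEF; ACP gene: not identified; malonyl-CoA transacylase gene: N/A.
20. apramide. Chain: C8 to C9. Fatty acid reference: Apramides A-G, novel lipopeptides from the marine cyanobacterium Lyngbya majuscula. Gene reference: synthetase genes have not been identified. Amino acid: NMe-Ala. Adenylate-forming enzyme gene: not identified; ACP gene: not identified; malonyl-CoA transacylase gene: N/A.
21. aquachelin. Chain: C12 to C14. Fatty acid reference: Structure and membrane affinity of a suite of amphiphilic siderophores produced by a marine bacterium. Gene reference: synthetase genes have not been identified. Amino acid: D-OH-Asp. Adenylate-forming enzyme gene: not identified; ACP gene: not identified; malonyl-CoA transacylase gene: N/A.
22. arylomycin. Chain: C11 to C15. Fatty acid reference: Arylomycins A and B, new biaryl-lipopeptide antibiotics produced by Streptomyces sp. Tu 6075. II. Structure elucidation. Gene reference: synthetase genes have not been identified. Amino acid: D-NMe-Ser. Adenylate-forming enzyme gene: not identified; ACP gene: not identified; malonyl-CoA transacylase gene: N/A.
23. CDA1b through CDA4B. Chain: 2-epoxy-hexanoic acid. Fatty acid reference: Structure, biosynthetic origin, and engineered biosynthesis of calcium-dependent antibiotics from Streptomyces coelicolor. Gene reference: Structure, biosynthetic origin, and engineered biosynthesis of calcium-dependent antibiotics from Streptomyces coelicolor. Amino acid: Ser. Adenylate-forming enzyme gene: ACS (acyl-CoA synthetase); ACP gene: SCO3249; malonyl-CoA transacylase gene: N/A.
24. carmabin. Chain: C10. Fatty acid reference: Carmabins A and B, new lipopeptides from the Caribbean cyanobacterium Lyngbya majuscula. Gene reference: synthetase genes have not been identified. Amino acid: NMe-Phe. Adenylate-forming enzyme gene: not identified; ACP gene: not identified; malonyl-CoA transacylase gene: N/A.
25. corrugatin. Chain: C8. Fatty acid reference: Corrugatin, a lipopeptide siderophore from Pseudomonas corrugata. Gene reference: synthetase genes have not been identified. Amino acid: OH-His. Adenylate-forming enzyme gene: not identified; ACP gene: not identified; malonyl-CoA transacylase gene: N/A.
26. daptomycin. Chain: C10 to C13. Fatty acid reference: A21978C, a complex of new acidic peptide antibiotics: isolation, chemistry, and mass spectral structure elucidation. Gene reference: Daptomycin biosynthesis in Streptomyces roseosporus: cloning and analysis of the gene cluster and revision of peptide stereochemistry. Amino acid: Trp. Adenylate-forming enzyme gene: DptE; ACP gene: DptF; malonyl-CoA transacylase gene: N/A.
27. enduracidin. Chain: C12 to C13. Reference: The enduracidin biosynthetic gene cluster from Streptomyces fungicidicus. Amino acid: Asp. Adenylate-forming enzyme gene: Orf45; ACP gene: Orf35; malonyl-CoA transacylase gene: N/A.
28. friulimicin. Chain: C13 to C15. Fatty acid reference: Friulimicins: novel lipopeptide antibiotics with peptidoglycan synthesis inhibiting activity from Actinoplanes friuliensis sp. nov. II. Isolation and structural characterization. Gene reference: Sequencing and analysis of the biosynthetic gene cluster of the lipopeptide antibiotic Friulimicin in Actinoplanes friuliensis. Amino acid: Asp or Asn. Adenylate-forming enzyme gene: LipA; ACP gene: LipD; malonyl-CoA transacylase gene: N/A.
29. marinobactin. Chain: C12 to C16. Fatty acid reference: Membrane affinity of the amphiphilic marinobactin siderophores. Gene reference: synthetase genes have not been identified. Amino acid: D-OH-Asp. Adenylate-forming enzyme gene: not identified; ACP gene: not identified; malonyl-CoA transacylase gene: N/A.
30. polymyxin. Chain: C7 to C9. Fatty acid reference: Contribution to the Elucidation of the Structure of Polymyxin B1. Gene reference: Identification of a Polymyxin Synthetase Gene Cluster of Paenibacillus polymyxa and Heterologous Expression of the Gene in Bacillus subtilis. Amino acid: 2,4-diaminobutyric acid. Adenylate-forming enzyme gene: not identified; ACP gene: not identified; malonyl-CoA transacylase gene: N/A.
31. putisolvin. Chain: C6. Fatty acid reference: Characterization of two Pseudomonas putida lipopeptide biosurfactants, putisolvin I and II, which inhibit biofilm formation and breakdown existing biofilms. Gene reference: Genetic and functional characterization of the gene cluster directing the biosynthesis of putisolvin I and II in Pseudomonas putida strain PCL1445. Amino acid: Leu. Adenylate-forming enzyme gene: not identified; ACP gene: not identified; malonyl-CoA transacylase gene: N/A.
32. ramoplanin. Chain: C8 to C10. Fatty acid reference: Studies on the biosynthesis of the lipodepsipeptide antibiotic Ramoplanin A2. Gene reference: Chemistry and biology of the ramoplanin family of peptide antibiotics. Amino acid: Asn. Adenylate-forming enzyme gene: Ramo 26; ACP gene: Ramo 11; malonyl-CoA transacylase gene: N/A.

The “fatty acid adding” domain of this synthetase adds both β-hydroxy and “normal” (not β-hydroxy) fatty acids to the amino acid

33. Amphibactin. Chain: C14 to C18. Fatty acid reference: Structure and membrane affinity of a suite of amphiphilic siderophores produced by a marine bacterium. Gene reference: synthetase genes have not been identified. Amino acid: N-acetyl-Hydroxy-Ornithine. Adenylate-forming enzyme gene: not identified; ACP gene: not identified; malonyl-CoA transacylase gene: N/A.

The “fatty acid adding” domain of this synthetase adds β-amino fatty acids to the amino acid

34. iturin. Chain: C14 to C17. Fatty acid reference: Revised structure of mycosubtilin, a peptidolipid antibiotic from Bacillus subtilis. Module 1 gene: MycA. Gene reference: Cloning, sequencing, and characterization of the iturin A operon. Amino acid: Asn. Adenylate-forming enzyme gene: N/A; ACP gene: N/A; malonyl-CoA transacylase gene: fenF.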

As is specifically described in Examples herein, additional new β-hydroxy acyl amino acids can be produced by operationally linking a condensation domain, which specifies the addition of a β-hydroxy fatty acid, to an adenylation domain which specifies a particular desired amino acid. In Example XXX, a condensation domain is operationally linked to an adenylation domain that is specific for glycine and, upon expression of the chimera in an appropriate host, β-hydroxy myristoyl glycine is produced. One who is skilled in the art will appreciate that this approach can be used to create any desired β-hydroxy acyl amino acid, as long as an adenylation domain is available that is specific for the desired amino acid.

Naturally occurring peptide synthetase modules are available that specify the use of each of the standard 20 amino acids, and in addition adenylation domains are available that are specific for about 300 additional amino acids, or amino acid-like molecules (Kleinkauf et al.) (Ref 10). This approach can be used to link a β-hydroxy fatty acid to any of these amino acids, or amino acid-like molecules.

Example 2: Engineered Peptide Synthetases Comprising or Consisting of Mycosubtilin Module 1 (MycA)

Strategies analogous to those described above can be used to link a β-amino fatty acid to any desired amino acid. One approach is to identify a naturally occurring “Module 1” (such as MycA of the mycosubtilin synthetase, see Duitman et al.) (Ref 6) and to express the module in an appropriate host. In this specific example, the FenF gene is desirably also expressed in the host (sequence listing AAF08794.1).

In general, a particular β-amino fatty acid can be produced in an appropriate host by expressing a module known to specify the joining of a β-amino fatty acid to a particular amino acid, along with any gene or genes that encode critical additional functions that are not naturally found in the host organism (such as, for example, FenF). Additional new β-amino acyl amino acids can be produced by operationally linking a condensation domain, which specifies the addition of a β-amino fatty acid, to an adenylation domain which specifies a particular desired amino acid. Again, any genes that encode additional required factors (such as homologs of FenF) can also be expressed in the host. This approach can be used to link a β-amino fatty acid to any amino acid, as long as an adenylation domain is available that is specific for the desired amino acid.

Example 3: Engineered Peptide Synthetases Comprising or Consisting of Daptomycin Synthetase Module 1

Strategies analogous to those described above can be used to link a fatty acid (which is unmodified at the β-position) to any desired amino acid. One approach is to identify a naturally occurring “Module 1” (such as the Trp1 module of the daptomycin synthetase, see Miao et al.) (Ref 11) and to express the module in an appropriate host (Sequence listing: dptA1 module 1 of daptomycin synthetase). In addition, in this specific example, the DptE and DptF genes should also be expressed in the host.

In general, a particular acyl amino acid (unmodified at the β-position) can be produced in an appropriate host by expressing a module known to specify the joining of a fatty acid to a particular amino acid, along with any gene or genes that encode critical additional functions that are not naturally found in the host organism (such as, for example, DptE and DptF). Additional new acyl amino acids can be produced by operationally linking a condensation domain, which specifies the addition of a fatty acid, to an adenylation domain which specifies a particular desired amino acid. For example, a fatty acid that is unmodified at the β-position can be attached to glycine using a chimeric synthetase composed of the condensation domain of dptA1 module 1 linked to the adenylation and thiolation domains of dptA1 module 5 (which is specific for glycine) (sequence listing dptA1 Module 5).
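The domain-swap reasoning used in Examples 1 through 3 can be written down as a small bookkeeping sketch. It does not model enzyme activity. The class names (`CondensationDomain`, `AdenylationThiolation`, `ChimericSynthetase`) and the rule that the product is simply "acyl type chosen by the condensation domain plus amino acid chosen by the adenylation domain" are simplifying assumptions introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CondensationDomain:
    """Hypothetical record: which kind of fatty acid the C domain attaches."""
    source: str
    acyl_type: str


@dataclass(frozen=True)
class AdenylationThiolation:
    """Hypothetical record: which amino acid the A/T domains select and carry."""
    source: str
    amino_acid: str


@dataclass(frozen=True)
class ChimericSynthetase:
    c_domain: CondensationDomain
    at_domains: AdenylationThiolation

    def product(self) -> str:
        # Simplifying assumption: acyl type and amino acid are chosen independently.
        return f"{self.c_domain.acyl_type} fatty acyl-{self.at_domains.amino_acid}"


# Specificities as stated in the text for the daptomycin synthetase.
c_from_module_1 = CondensationDomain("dptA1 module 1", "beta-unmodified")
at_from_module_5 = AdenylationThiolation("dptA1 module 5", "Gly")

chimera = ChimericSynthetase(c_from_module_1, at_from_module_5)
print(chimera.product())
```

Keeping the two domain records separate mirrors the text: the condensation domain is reused unchanged while the adenylation and thiolation domains are exchanged to change the amino acid.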

Example 4: Additional Genes Useful or Necessary for Some Embodiments

For the Calcium-Dependent Antibiotic (CDA) system, it is believed that specific locus-associated fatty acid synthases produce the hexanoic acid, which is joined to the first amino acid of CDA; in particular, the ACP (SCO3249), FabH4 (SCO3246), and FabF3 (SCO3248) gene products are believed to be important for production of the hexanoic acid, which is then joined to the amino acid substrate, in this case Ser (Ref 12).

Example 5: FA-Glu Compositions

In some embodiments, the distribution of fatty acids produced by a typical engineered strain that utilizes an engineered peptide synthetase to synthesize FA-Glu is composed of fatty acids that all have a β-hydroxyl but that have varying chain lengths. In some particular embodiments, the chain lengths vary in the following manner: C12, 1.6%; C13, 16.2%; C14, 55%; C15, 25.9%; C16, 1.2% and C17, 0.01%.
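As a quick check on the distribution quoted above, the percentages can be tabulated to see how much of the population is even numbered versus odd numbered and where the abundance-weighted mean chain length falls. This is a minimal sketch; the variable names are illustrative, and the only data used are the six percentages given in the text.

```python
# Chain-length distribution quoted in the text (percent of FA-Glu population).
distribution = {12: 1.6, 13: 16.2, 14: 55.0, 15: 25.9, 16: 1.2, 17: 0.01}

total = sum(distribution.values())
even = sum(pct for carbons, pct in distribution.items() if carbons % 2 == 0)
odd = total - even

# Abundance-weighted mean chain length, normalized by the listed total.
mean_length = sum(carbons * pct for carbons, pct in distribution.items()) / total

print(f"total listed: {total:.2f}%")
print(f"even-numbered chains: {even:.2f}%")
print(f"odd-numbered chains: {odd:.2f}%")
print(f"weighted mean chain length: C{mean_length:.2f}")
```

The listed values sum to slightly less than 100%, so the mean is normalized by the listed total rather than by 100.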

In some embodiments, some of the even numbered fatty acids are branched and some are straight chain.

In some embodiments, none of the odd numbered fatty acids are straight chain (i.e., they are all branched). Odd numbered chains can be either iso or anteiso; in some embodiments, the present invention provides different compositions with different relative amounts (e.g., ratios) of these forms. Branching nomenclature is well presented in FIG. 1 of Ref 16 (Toshi Kaneda, Fatty Acids of the Genus Bacillus: an example of branched-chain preference, Bacteriological Reviews, 1977, Vol 41(2), 391-418).

In some embodiments, for an engineered strain that produces FA-Glu with an engineered peptide synthetase, the fatty acid chain distribution changes when particular keto acids are fed to the strain (see Table 1 below). Dramatic changes in fatty acid chain distribution can be generated when the enzyme that synthesizes the keto acids used to initiate fatty acid synthesis in Bacillus is knocked out and single keto acids are fed to the strain. In some embodiments, as the concentration of the keto acid is changed, the pattern of fatty acid species is altered.

In some embodiments, compositions containing FA-Glu with 95% C14 fatty acid are produced by feeding 20 mM isobutyric acid to the mutant.

In some embodiments, low levels of keto acids that can only be used to produce branched fatty acids with odd numbered chains (100 μM 2-methylbutyric acid or 100 μM isovaleric acid) are fed to the mutant to produce a population of surfactant in which about 80% has a C14 length fatty acid.

Significantly, since the mutant cannot synthesize its own keto acid starters for even numbered branched chain fatty acid synthesis, feeding of low concentrations of either of these keto acids (100 μM 2-methylbutyric acid or 100 μM isovaleric acid) allows the production of a population of surfactant that is predominantly even numbered and straight chain. Thus, the present invention surprisingly provides methods and compositions for generating, and compositions comprising, mostly straight chain (rather than branched) fatty acid produced by B. subtilis. Indeed, the present invention specifically describes strategies for generating a Bacillus strain (and strains so generated) that exclusively produces straight chain fatty acid.

Example 6: Production of Amphoteric Surfactants

The present Example describes use of engineered peptide synthetases (in engineered host cells) to produce amphoteric surfactants with one region or regions that harbor a negative charge and another region or regions that harbor a positive charge. Examples of amino acids that can be used to produce such surfactants are shown below. The amino acids all have two amino groups and include: 2,4-diaminobutyric acid, (2S)-2,3-diaminobutyric acid, 2,3-diaminopropionic acid, ornithine and lysine.

One particular example of a surfactant of this sort is shown in FIG. 2; it is β-hydroxy myristoyl diaminopropionic acid.

This surfactant will be zwitterionic at physiological pH given that the pKa of the beta amine of 2,3-diaminopropionic acid is 9.57 and the pKa of an alpha carboxyl is about 2.2. To generate this surfactant, a condensation domain capable of directing the linkage of β-hydroxy fatty acid to an amino acid (such as the condensation domain of srfAA module 1) (sequence listing srfAA Module 1) is linked to the adenylation and thiolation domain of a module that is specific for 2,3-diaminopropionic acid (DAP). Felnagle et al. described a peptide synthetase that incorporates DAP. The synthetase is found in Saccharothrix mutabilis subsp. capreolus ATCC 23892. The DAP-specific module is the second module of CmnA (Sequence listing CmaA, A2).
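The zwitterion claim follows from the Henderson-Hasselbalch relationship, which can be evaluated directly from the two pKa values quoted above. The sketch below makes three assumptions that are not in the text: physiological pH is taken as 7.4, the two ionizable groups are treated as independent, and the quoted pKa values are assumed to apply unchanged to the acylated surfactant.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch: fraction of a group in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PKA_BETA_AMINE = 9.57      # value quoted in the text
PKA_ALPHA_CARBOXYL = 2.2   # approximate value quoted in the text
PH = 7.4                   # assumed physiological pH (not stated in the text)

# Protonated amine carries +1; deprotonated carboxyl carries -1.
amine_positive = fraction_protonated(PH, PKA_BETA_AMINE)
carboxyl_negative = 1.0 - fraction_protonated(PH, PKA_ALPHA_CARBOXYL)

# Independent-sites approximation for the doubly charged (zwitterionic) form.
zwitterion_fraction = amine_positive * carboxyl_negative

print(f"amine protonated (+): {amine_positive:.4f}")
print(f"carboxyl deprotonated (-): {carboxyl_negative:.6f}")
print(f"approximate zwitterion fraction: {zwitterion_fraction:.4f}")
```

Re-running the same function at a higher pH shows the amine fraction falling toward zero, which is the pH dependence that motivates conversion to a betaine.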

Bacillus subtilis 168 does not synthesize DAP. Two genes need to be added to Bacillus in order to enable conversion of serine to DAP. The genes are described in references cited below. The genes are found in Staphylococcus aureus and other bacteria. The genes are called sbnA and sbnB. For example, the genes are present in Staphylococcus aureus strain JH9, and also in Staphylococcus aureus strain Mu50/ATCC 700699. The sbnA gene (sequence listing sbnA) is also known as SaurJH9 0103. The sbnB gene (sequence listing sbnB) is also known as SaurJH9 0104.

Homologues of the sbnA and sbnB genes can be used instead of, or in addition to, sbnA and sbnB. For example, Bacillus cereus strains that synthesize zwittermicin encode homologues of sbnA and sbnB, called ZmaU (sequence listing ZmaU) and ZmaV (sequence listing ZmaV), respectively.

The charge of the primary amine of the surfactant shown in FIG. 2 will depend on pH, and will be positive in the vicinity of pH 7.0. As the pH is elevated, the amine will lose a hydrogen and become neutral in charge. A surfactant with a positive charge that is independent of pH can be produced by converting the surfactant shown above into a betaine, which harbors a quaternary ammonium group (see FIG. 3).

This can be done in vitro using a method described by Simon and Shokat (see reference in reference list). 100 mg of (2-bromoethyl) trimethylammonium bromide are added to a microfuge tube. 1 mL of a solution of the fatty acid-DAP (FA-DAP) surfactant is added to the tube. The mixture is shaken at 50° C. until the solid dissolves. Reaction proceeds for about 5 hours. To consume the remaining alkylating agent, the reaction is quenched with 50 μl 2-mercaptoethanol and incubated at room temperature for 30 minutes.

Alternatively or additionally, methylation can be accomplished in vivo using a methyltransferase. Bacterial α-N-methyltransferases have been described by Zhang et al. As an example, genes encoding methyltransferases can be obtained from Bacillus subtilis (sequence listing Bacillus prmA) or E. coli (sequence listing E. coli prmA). A methyltransferase that modifies cypemycin can be used (sequence listing cypemycin methyltransferase); the gene is found in Streptomyces sp. OH-4156. A gene encoding a similar protein (76% identical) can be obtained from Streptomyces griseus subsp. griseus NBRC 13350 (sequence listing Streptomyces griseus methyltransferase).

Example 7: Production of Fatty Acids and Fatty Acid Derivatives with Particular Fatty Acid Branching Patterns

Naturally occurring fatty acids produced by living organisms typically have two sorts of modifications that affect the melting temperature of the fatty acids and their derivatives. These modifications are branching and desaturation (i.e., the presence of particular double bonds), and both modifications lower the melting point of the fatty acid.

Certain organisms, including particular gram positive and gram negative bacteria, as well as typical eukaryotes such as yeast, control the fluidity of membranes by desaturation of fatty acids. The ability to introduce desaturated fatty acids into membranes is important with regard to maintenance of membrane fluidity as temperature decreases. Certain bacteria, such as Bacillus subtilis, do not rely on desaturation to increase membrane fluidity. Instead, these bacteria control membrane fluidity via the synthesis of branched fatty acids (for a list of representative bacterial genera that synthesize branched fatty acids, see Table 3 of Ref 13).

Given the general need of organisms to control membrane fluidity, biologically produced oils typically contain branches, double bonds, or both. From the perspective of commercial production of fatty acids and their derivatives, there is a need to control these branching and desaturation reactions in order to produce fatty acids with particular characteristics that provide specific benefits to customers. Methods for controlling branching and desaturation are described below.

As background information, consider E. coli as an example of an organism that synthesizes straight chain fatty acids (i.e., fatty acids that lack branching). In E. coli, fatty acid synthesis initiates when the enzyme fabH (β-ketoacyl-ACP synthase III) catalyzes condensation of acetyl-coenzyme A (acetyl CoA) with malonyl-acyl carrier protein (malonyl-ACP) (Ref 14). This condensation produces an acetoacetyl-ACP that is then elongated by the iterative action of the E. coli fatty acid synthesis machinery.

Initiation of fatty acid synthesis in Bacillus subtilis occurs by a different, but similar, mechanism. Bacillus subtilis encodes two β-ketoacyl-ACP synthase III enzymes (fabHA and fabHB). Although these enzymes will utilize acetyl-CoA as a substrate, they prefer to use branched substrates such as isobutyryl-CoA, 2-methylbutyryl-CoA and isovaleryl-CoA (Ref 15). These CoA derivatives are produced from the amino acids L-valine, L-isoleucine and L-leucine, respectively (Ref 16).

Initiation of fatty acid synthesis with a branched starter unit leads to the synthesis of a terminally branched fatty acid. The precise chemical composition of the branched starter impacts the length and specific branching of the synthesized fatty acid. For example, initiation with isobutyrate in Bacillus leads to production of “iso” fatty acids with even number lengths, such as 14 carbons (C14) and 16 carbons (C16). Initiation with 2-methyl butyrate leads to synthesis of odd numbered “anteiso” fatty acids (e.g., C15 and C17). Initiation with isovalerate leads to synthesis of odd numbered “iso” fatty acids (e.g., C15 and C17).
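The relationship between starter unit and product can be expressed as simple carbon arithmetic: the starter contributes its own carbons and branch type, and each elongation cycle adds two carbons. The sketch below encodes that rule. The function name `predicted_fatty_acid`, the output labels, and the inclusion of acetate (standing in for the acetyl-CoA starter that gives straight chains) are illustrative choices, and the number of elongation cycles is supplied by hand rather than predicted.

```python
# Carbon count and branch type of each starter unit.
# "acetate" stands for the acetyl-CoA starter, which gives unbranched ("n") chains.
STARTER_UNITS = {
    "acetate": (2, "n"),
    "isobutyrate": (4, "iso"),
    "2-methylbutyrate": (5, "anteiso"),
    "isovalerate": (5, "iso"),
}


def predicted_fatty_acid(starter: str, elongation_cycles: int) -> str:
    """Return branch type and chain length after the given number of cycles."""
    starter_carbons, branch = STARTER_UNITS[starter]
    # Each elongation cycle extends the chain by two carbons.
    chain_length = starter_carbons + 2 * elongation_cycles
    return f"{branch}-C{chain_length}"


for starter in STARTER_UNITS:
    products = [predicted_fatty_acid(starter, cycles) for cycles in (5, 6)]
    print(f"{starter}: {', '.join(products)}")
```

Because every cycle adds two carbons, the parity of the product is fixed by the starter: a four-carbon starter can only give even numbered chains, and a five-carbon starter can only give odd numbered chains.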

The enzymatic activity responsible for conversion of particular amino acids (L-valine, L-isoleucine and L-leucine) to their respective keto acids is α-keto acid dehydrogenase. Mutant Bacillus cells that lack α-keto acid dehydrogenase activity require the addition of at least one keto acid for growth (isobutyrate, 2-methyl butyrate or isovalerate) (Ref 17). Feeding a specific keto acid to a strain that lacks α-keto acid dehydrogenase activity not only rescues the growth deficiency of the mutant strain but also specifically affects the fatty acid composition of the cells. For example, feeding isobutyrate to the mutant leads to the exclusive synthesis of fatty acids with even numbered chain length. These fatty acid chains include fatty acids derived from the isobutyrate starter (i14:0, 33%; i16:0, 51%) and also straight chain fatty acids produced using de novo synthesized acetate as a starter (14:0, 2%; 16:0, 13%) (see Ref 17). Furthermore, note that the odd numbered fatty acids are eliminated when a strain that lacks α-keto acid dehydrogenase activity is fed isobutyrate (but not fed 2-methyl butyrate and/or isovalerate).

Feeding of 2-methyl butyrate leads to the production of a15:0, 51% and a17:0, 39%, with some straight chain even numbered fatty acid still produced via utilization of de novo produced acetate (14:0, 2%; 16:0, 8%) (Ref 17).

Feeding of isovalerate leads to the following pattern: i15:0, 56%; a15:0, 7%; i17:0, 12%; a17:0, 2%; 14:0, 3% and 16:0, 16%. The presence of anteiso fatty acids is unexpected and suggests that the isovalerate used in the study was contaminated with a keto acid such as 2-methyl butyrate. The straight chain even numbered fatty acids are produced utilizing de novo produced acetate (these data are taken from Ref 17).
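The three feeding profiles quoted from Ref 17 can be compared side by side by totalling the branched, straight chain, and odd numbered fractions for each. The sketch below is illustrative; the dictionary layout and the `summarize` helper are assumptions introduced here, and the only data are the percentages given in the text.

```python
# Fatty acid profiles quoted in the text (percent), keyed by the acid fed.
# Prefix "i" = iso, "a" = anteiso, no prefix = straight chain.
profiles = {
    "isobutyrate": {"i14:0": 33, "i16:0": 51, "14:0": 2, "16:0": 13},
    "2-methylbutyrate": {"a15:0": 51, "a17:0": 39, "14:0": 2, "16:0": 8},
    "isovalerate": {"i15:0": 56, "a15:0": 7, "i17:0": 12, "a17:0": 2,
                    "14:0": 3, "16:0": 16},
}


def summarize(profile: dict) -> dict:
    """Total the listed, branched, straight chain and odd numbered percentages."""
    total = sum(profile.values())
    branched = sum(pct for name, pct in profile.items() if name[0] in "ia")
    odd = sum(
        pct for name, pct in profile.items()
        if int(name.lstrip("ia").split(":")[0]) % 2 == 1
    )
    return {"total": total, "branched": branched,
            "straight": total - branched, "odd": odd}


for fed, profile in profiles.items():
    print(fed, summarize(profile))
```

In each profile the straight chain fraction is the portion attributed in the text to de novo produced acetate, and the listed species do not always sum to 100%.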

There is a commercial need to produce fatty acids and fatty acid derivatives with precise lengths and branching. In Examples herein, we describe methods for producing particular populations of fatty acids and fatty acid derivatives, such as acyl amino acid surfactants.

In addition to specifically controlling the branching of fatty acids in organisms such as Bacillus, it is advantageous in certain cases to eliminate branching in organisms such as Bacillus, for example in order to produce surfactants with straight chain rather than branched fatty acid tails. This can be accomplished by expressing a β-ketoacyl-ACP synthase III enzyme in Bacillus that prefers to use straight chain starters, such as acetyl CoA. As an example of this, Li and coworkers converted a strain of Streptomyces coelicolor (which typically predominantly synthesizes branched fatty acids) into a strain that synthesizes 86% straight chain fatty acids by replacing the endogenous β-ketoacyl-ACP synthase III enzyme with E. coli fabH (Ref 18). A general method can be followed to identify enzymes that function in a manner analogous to E. coli fabH, that is, they initiate fatty acid synthesis using predominantly straight chain starter units, such as acetyl CoA, which will result in the synthesis of straight chain fatty acids.

Methods such as gas-liquid chromatography can be used to determine whether an organism synthesizes straight chain fatty acids, or instead synthesizes a mixture of straight chain and branched fatty acids. For example, Kaneda (Ref 16) used gas-liquid chromatography to characterize the fatty acids of sixteen species of Bacillus, and found that all sixteen species synthesized a mixture of straight chain and branched fatty acids. In contrast, a similar study reported by Kaneda and Smith (Ref 19) showed that certain bacteria and yeasts exclusively synthesize straight chain fatty acids, and indeed it is true that most organisms synthesize exclusively straight chain fatty acids. Kaneda and Smith reported that the bacteria E. coli and Pseudomonas fluorescens exclusively synthesize straight chain fatty acids. Other examples of organisms that exclusively synthesize straight chain fatty acids are reported in Ref 20 and include various Streptococcus and Enterococcus species, and other species.

Once an organism has been identified that exclusively synthesizes straight chain fatty acids, and assuming the genome of the organism has been sequenced, comparative sequence analysis can be used to determine whether the organism encodes a protein similar to E. coli fabH. For example, the gene encoding the Streptococcus pneumoniae fabH homologue is 39% identical to E. coli fabH. The Streptococcus fabH has been cloned and, when the enzyme was produced and studied in vitro, it was found to prefer short straight chain CoA primers and to synthesize straight chain fatty acids (Ref 21) (see sequence listing AF384041).
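The percent identity figure used in such a comparison can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by an external tool, and it counts identity only over columns in which neither sequence has a gap; other conventions for the denominator exist and give different numbers. The function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment, for illustration only.
toy_a = "ACGT-A"
toy_b = "ACCTGA"
print(f"{percent_identity(toy_a, toy_b):.1f}% identical")
```

In practice the alignment itself would come from a dedicated alignment program, and the candidate would then be tested functionally, as was done for the Streptococcus enzyme.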

In certain instances, an organism that exclusively or predominantly synthesizes straight chain fatty acids will encode an enzyme that is functionally equivalent to E. coli fabH, but that does not have homology to fabH. As an example, the Pseudomonas aeruginosa PA5174 gene encodes a fabY enzyme that is not homologous to fabH, but serves the same function and prefers to use acetyl CoA as the starter for fatty acid synthesis (see Ref 22, “Fatty Acid Biosynthesis in Pseudomonas aeruginosa Is Initiated by the FabY Class of β-Ketoacyl Acyl Carrier Protein Synthases”). Genes homologous to PA5174 that can be used for this purpose include the following genes and their homologues (see Sequence listing): Pmen_0396, MDS_0454, Psefu_4068, Avin_05510, PSPA7_5914, PLES_55661 and PA14_68360.

In order to convert a strain that produces branched fatty acids (such as Bacillus subtilis) into a strain that produces predominantly or exclusively straight chain fatty acids, a gene such as E. coli fabH or Pseudomonas aeruginosa PA5174 is introduced into the strain such that it is expressed at the correct time and level. In the specific case of Bacillus subtilis, to ensure that the heterologous enzyme, which prefers straight chain starters, is expressed at the correct time and at the correct level, it is advantageous to place the heterologous gene that encodes the β-ketoacyl-ACP synthase III enzyme under the control of the promoter that usually controls the expression of Bacillus fabHA (the fabHA promoter, see sequence listing “fabhA promoter”).

Once the heterologous β-ketoacyl-ACP synthase III enzyme is being expressed in Bacillus, branched fatty acid synthesis can be further reduced by reducing, altering or eliminating α-keto acid dehydrogenase activity. In addition, the level of branched fatty acid can be reduced by reducing, altering or eliminating the activity of the endogenous Bacillus fabHA and/or fabHB genes (also known as fabH1 or fabH2).

When engineered strains are developed with lower levels of branched fatty acids, it is advantageous to express a desaturase enzyme in Bacillus in order to introduce sufficient double bonds into a subset of the Bacillus fatty acids to enable the Bacillus to maintain membrane fluidity. Examples of desaturases that can be used include the Δ9-fatty acid desaturase from Psychrobacter urativorans (Ref 23) (sequence listing EF617339) and the Δ9-fatty acid desaturase from Mortierella alpina (Ref 24) (sequence listing AB015611).

Alternatively or additionally, genetic changes can be made that result in the constitutive expression of the endogenous Bacillus desaturase, des (Ref 25) (sequence listing AF037430). For example, constitutive des expression can be enabled via deletion of desK (Seq listing DesKgen) (Ref 26). It has been demonstrated that strains with a lipA (yutB) knockout are not able to synthesize fatty acids and require both keto acids and acetate for growth (Ref 26). Constitutive expression of des was achieved by knocking out desK, which leads to overexpression of the transcriptional activator DesR, resulting in constitutive expression of des. Overexpression of des led to desaturation of about 13% of the Bacillus fatty acids and eliminated the keto acid requirement, indicating that the growth defect caused by an inability to produce branched fatty acids can be overcome by desaturation of a certain population of Bacillus fatty acids.

An alternative strategy to produce acyl amino acid surfactants with straight chain fatty acids is to express the peptide synthetase enzyme that produces the acyl amino acid in a strain that does not produce branched fatty acids, such as E. coli. It has been reported that the srfA operon required for production of surfactin has been cloned and expressed in E. coli (Ref 27). However, the lipopeptide was not characterized directly; rather, the authors report that the engineered strain produces a new hydrophobic compound, which was analyzed by TLC using surfactin as a control. Surfactin's Rf value was 0.63 and the new hydrophobic compound showed an Rf value of 0.52. The authors did not speculate on why the Rf values differed.

An acyl amino acid with a straight chain fatty acid can be produced by cloning a gene that encodes a peptide synthetase enzyme capable of directing the synthesis of an acyl amino acid (such as Module 1 of srfAA) into an E. coli plasmid under the control of a promoter such as the T7 promoter and introducing the cloned gene into E. coli. It is also necessary to clone and express a gene such as Bacillus sfp, which encodes a phosphopantetheinyl transferase needed to modify peptide synthetase enzymes in order to functionally activate those enzymes (see Ref 28). The amount of surfactant produced, and the length of the fatty acid tails present on the population of surfactant molecules, can be determined using LCMS as described in Ref 29.

Once a strain is generated that produces a desired acyl amino acid, the strain can be further modified to increase the yield of the acyl amino acid. One strategy for increasing yield is to inactivate (e.g., delete) genes that limit production of the acyl amino acid. Once genes are identified that, when deleted, increase yield of an acyl amino acid, a strain harboring multiple such deletions can be generated. In addition, genes that either do not affect surfactant yield, or that negatively affect surfactant yield, can be replaced with genes that stimulate acyl amino acid production. Examples herein describe genes that, when deleted, increase yield of an acyl glutamate surfactant referred to as FA-Glu.

Example 8: Production of β-Hydroxy Myristoyl Glycinate by Fermentation

As described in U.S. Pat. No. 7,981,685, Modular Genetics, Inc. (Modular) has shown that an engineered peptide synthetase enzyme can be used to produce an acyl amino acid (β-hydroxy myristoyl glutamate). This approach has been expanded to produce β-hydroxy myristoyl glycinate. Details of the production of β-hydroxy myristoyl glycinate follow.

Engineering of a FA-GLY-TE Construct Using a Fusion Between DNA encoding the condensation domain of srfAA module 1 and DNA encoding the adenylation domain of Module 2 of Linear Gramicidin.

In this Example, the region of genomic DNA from OKB105Δ(upp)SpectRFA-GLU-TE-MG that encodes the genes responsible for FA-GLU production was amplified using primers 35664-C4: 5′-TTGTACTGAGAGTGCACCATAtATCGACAAAAATGTCATGAAAGAATCG-3′ (SEQ ID NO: 3) and 35664-D4: 5′-ACGCCAAGCTTGCATGCCtTTATGAAACCGTTACGGTTTGTGTATT-3′ (SEQ ID NO: 4). This fragment was annealed to the PCR product obtained from the template pUC19 using primers 35664-B4: 5′-AGGCATGCAAGCTTGGCGtAATCATGGTCATAGCTGTTTCCTGTG-3′ (SEQ ID NO: 5) and 35664-A4: 5′-ATATGGTGCACTCTCAGTACAaTCTGCTCTGATGCCGCATAGTT-3′ (SEQ ID NO: 6). The annealed mixture was transformed into SURE cells to produce the plasmid Psrf-Glu-TE-pUC19.
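The annealing step relies on the two PCR products sharing complementary ends. A minimal sketch of how this can be checked from the primer sequences alone is given below. The reading that the 5′ ends of each primer pair were designed to be reverse complements of one another is an inference from the sequences and is not stated in the text; the function names are hypothetical, and the primer strings are copied from the paragraph above.

```python
# Map each base to its complement, preserving case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def complementary_prefix_length(primer_a: str, primer_b: str) -> int:
    """Longest k such that the first k bases of primer_a are the reverse
    complement of the first k bases of primer_b (case-insensitive)."""
    a, b = primer_a.upper(), primer_b.upper()
    for k in range(min(len(a), len(b)), 0, -1):
        if reverse_complement(a[:k]) == b[:k]:
            return k
    return 0


PRIMERS = {
    "35664-C4": "TTGTACTGAGAGTGCACCATAtATCGACAAAAATGTCATGAAAGAATCG",
    "35664-D4": "ACGCCAAGCTTGCATGCCtTTATGAAACCGTTACGGTTTGTGTATT",
    "35664-B4": "AGGCATGCAAGCTTGGCGtAATCATGGTCATAGCTGTTTCCTGTG",
    "35664-A4": "ATATGGTGCACTCTCAGTACAaTCTGCTCTGATGCCGCATAGTT",
}

# Insert primer paired with the vector primer it is expected to anneal to.
for insert_primer, vector_primer in (("35664-C4", "35664-A4"), ("35664-D4", "35664-B4")):
    k = complementary_prefix_length(PRIMERS[insert_primer], PRIMERS[vector_primer])
    print(f"{insert_primer} / {vector_primer}: {k} complementary 5' bases")
```

Under this reading, each complementary stretch ends at the single lowercase base in the primer, so the insert and the pUC19 product carry matching ends at both junctions.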

Psrf-Glu-TE-pUC19 was used as a template to engineer a variant of this plasmid that contained a fusion of the condensation domain of srfAA module 1 to the adenylation domain of Module 2 of Linear Gramicidin (which adenylation domain is specific for the amino acid glycine), followed by the TE.

The DNA sequence corresponding to Module 2 of Linear Gramicidin was amplified from genomic DNA of strain Bacillus brevis (ATCC 8185) using primers 35664-G4: 5′-GCTTGCTTGCGGAGCAGATCA-3′ (SEQ ID NO: 7) and 35664-H4: 5′-TCGAATCTCCGCCCAGTTCGA-3′ (SEQ ID NO: 8). The resulting PCR product was used as a template for primers 35664-H2: 5′-CACTGATTTCTGATGCGGAgAAACGCGATTTGTTTTTGCGG-3′ (SEQ ID NO: 9) and 35664-F2: 5′-CTCCGAGCGCAAAGAAATcGTCGCGAATCCCGATCCG-3′ (SEQ ID NO: 10).

This fragment was annealed to the PCR product obtained from the template Psrf-Glu-TE-pUC19 using primers 35664-C7: 5′-GATTTCTTTGCGCTCGGAgGGCATTCCTTGAAGGCCATGA-3′ (SEQ ID NO: 11) and 35664-E7: 5′-CTCCGCATCAGAAATCAGTgTTAATTCATCAATTGTATGTTCTGGATGC-3′ (SEQ ID NO: 12). The annealed mixture was transformed into SURE cells to produce the plasmid Psrf-Gly-lgr_m2-F3-TE-pUC19. This plasmid was used to transform 23844-dl OKB105Δ(upp)SpectR(Δ mod(2-7))upp+KanR. The resulting strain was named OKB105Δ(upp)SpectRFA-GLY-TE.

One strain derived from this strategy, which had the correct sequence to produce FA-GLY, was named 37237-d3. LC-MS analysis showed that strain OKB105Δ(upp)SpectRFA-GLY-TE produced detectable amounts of FA-GLY, and MS-MS analysis of the material derived from OKB105Δ(upp)SpectRFA-GLY-TE confirmed that the product was indeed FA-GLY (sequence listing Psrf-Gly-lgr_m2-F3-TE-pUC19).

See FIG. 5 for an LCMS analysis of FA-Gly. The 300 Dalton species is FA-Gly with a 14 carbon fatty acid tail. The 600 Dalton species is a dimer of the 300 Dalton species. The 314 Dalton species is FA-Gly with a 15 carbon fatty acid tail. The 628 Dalton species is a dimer of the 314 Dalton species.

See FIG. 6 for an MS/MS analysis of the 314 Dalton and 328 Dalton species: The 314 species fragments into one species that has Gly+CH₃CO and a second species that is the expected size of the remainder of the fatty acid (labeled “-Gly”). The 328 species fragments into one species that has Gly+CH₃CO and a second species that is the expected size of the remainder of the fatty acid (labeled “-Gly”).
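The species masses quoted for FIG. 5 and FIG. 6 can be compared with the masses expected from composition. The sketch below is illustrative only and rests on two assumptions that the text does not state: that the quoted values are nominal m/z values of singly deprotonated ions, and that every chain length carries a saturated β-hydroxy fatty acid (CnH2nO3) joined to the amino acid through an amide bond with loss of one water. Function and variable names are hypothetical.

```python
# Monoisotopic masses (Da) of the elements involved, and the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727646

# Elemental composition of the free amino acids.
AMINO_ACIDS = {
    "Gly": {"C": 2, "H": 5, "N": 1, "O": 2},
    "Glu": {"C": 5, "H": 9, "N": 1, "O": 4},
}


def acyl_amino_acid_formula(chain_length: int, amino_acid: str) -> dict:
    """Composition of a beta-hydroxy acyl amino acid.

    Assumes a saturated beta-hydroxy fatty acid CnH2nO3 condensed with the
    amino acid; amide bond formation releases one H2O.
    """
    aa = AMINO_ACIDS[amino_acid]
    return {
        "C": chain_length + aa["C"],
        "H": 2 * chain_length + aa["H"] - 2,
        "N": aa["N"],
        "O": 3 + aa["O"] - 1,
    }


def deprotonated_mz(chain_length: int, amino_acid: str) -> float:
    """m/z of the singly deprotonated ion (assumed ionization state)."""
    formula = acyl_amino_acid_formula(chain_length, amino_acid)
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral - PROTON


for n in (14, 15, 16):
    print(f"FA-Gly C{n}: m/z {deprotonated_mz(n, 'Gly'):.3f}")
```

Under these assumptions the C14, C15 and C16 glycinates fall at nominal masses 300, 314 and 328, and the same arithmetic applied to the glutamate conjugate gives 372 for C14, in line with the 372 = C14 column heading of Table A.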

Example 9

The Bacillus α-keto acid dehydrogenase activity was knocked out by deleting the genes that encode two enzymes, bkdAA and bkdAB. These genes encode the Bacillus E1α and E1β components of α-keto acid dehydrogenase (also known as branched chain α-oxo acid dehydrogenase); see Ref 30. These genes were knocked out in a strain that produces an acyl amino acid surfactant called FA-Glu, which is composed of fatty acid (FA) linked to the amino acid glutamic acid (Glu).

As is shown in Table A for the control strain (which retains α-keto acid dehydrogenase activity), the surfactant is composed of a population of molecules with fatty acid tails that vary in length from C12 to C17, with C14 predominant (55%). When the mutant strain (which lacks α-keto acid dehydrogenase activity) is fed 20 mM isobutyrate, the fatty acid composition of the surfactant population narrows to about 95% C14. Surfactants with a fatty acid tail length of C14 are particularly useful for certain applications, such as use in personal care products such as shampoos, body washes and other products. The population of surfactant fatty acid tail lengths can be specifically modified by feeding the mutant strain a starter keto acid that results in production of odd numbered branched fatty acids. Specifically, a population of surfactant molecules with a fatty acid tail composition of C13:0, 27%; C15:0, 65% was produced upon feeding the mutant 20 mM 2-methylbutyric acid. Thus, the strain produced surfactant with over 90% odd numbered branched fatty acid tails (presumably anteiso). A population of surfactant molecules with a fatty acid tail composition of C12:0, 3.71%; C14:0, 76.04%; C16:0, 2.20% was produced upon feeding the mutant 100 μM 2-methylbutyric acid. Thus, the strain produced surfactant with over 80% even numbered fatty acid tails. Given that the mutant strain is incapable of producing branched fatty acids with even numbered chain lengths, and was fed a keto acid that can only be used to produce odd numbered branched fatty acids, this population of even numbered fatty acid molecules is comprised of straight chain (unbranched) fatty acids. Feeding of 20 mM isovaleric acid produced surfactant with over 90% odd numbered branched fatty acid tails (presumably iso). Feeding of 100 μM isovaleric acid produced surfactant with over 80% even numbered (straight chain) fatty acid tails.
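The odd numbered and even numbered totals quoted above follow from summing the relevant columns of Table A. A minimal sketch of that arithmetic is shown below, using the mutant strain rows as they appear in Table A; the data structure and function name are illustrative choices, not part of the described method.

```python
# Chain lengths corresponding to the six percentage columns of Table A.
CHAIN_LENGTHS = (12, 13, 14, 15, 16, 17)

# Mutant strain rows copied from Table A: percent of FA-Glu at each chain length.
MUTANT_ROWS = {
    ("2-methylbutyric", "20 mM"): (0.68, 27.28, 6.37, 64.77, 0.17, 0.73),
    ("2-methylbutyric", "100 uM"): (3.71, 10.41, 76.04, 7.56, 2.20, 0.07),
    ("isovaleric", "20 mM"): (0.31, 36.41, 4.14, 58.89, 0.09, 0.16),
    ("isovaleric", "100 uM"): (3.53, 8.30, 78.33, 7.89, 1.93, 0.02),
    ("isobutyric", "20 mM"): (1.68, 0.30, 94.50, 0.81, 2.69, 0.02),
}


def odd_even_split(percentages):
    """Return (odd total, even total) of the chain length percentages."""
    odd = sum(p for n, p in zip(CHAIN_LENGTHS, percentages) if n % 2 == 1)
    even = sum(p for n, p in zip(CHAIN_LENGTHS, percentages) if n % 2 == 0)
    return odd, even


for (acid, concentration), row in MUTANT_ROWS.items():
    odd, even = odd_even_split(row)
    print(f"{concentration} {acid}: odd {odd:.2f}%, even {even:.2f}%")
```

For example, the 20 mM 2-methylbutyric row sums to about 92.8% odd numbered tails and the 100 μM 2-methylbutyric row to about 82% even numbered tails, matching the "over 90%" and "over 80%" statements.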

We have demonstrated previously that acylases can be used to specifically cleave an acyl amino acid surfactant to generate a free fatty acid and an amino acid. This approach can be used with the surfactant populations described above to produce particular purified populations of fatty acids, for example a population composed of over 90% C14 fatty acid, a population composed of over 90% anteiso C13 and C15 or over 90% iso C13 and C15, or a population composed of over 80% straight chain (even numbered) fatty acids.

Experimental Details:

In this example, we amplified the genomic region of B. subtilis strain OKB105 encoding the bkdAA and bkdAB genes and upstream and downstream flanking genes (buk, lpdV, bkdB, bmrR, and bmr) using primers 47014: 5′-AATATCGTATTGAATAGACAGACAGG-3′ (SEQ ID NO: 13) and 47015: 5′-ATCTTTATTTGCATTATTCGTGGAT-3′ (SEQ ID NO: 14). The resulting PCR product was used as a template to amplify both upstream and downstream fragments.

The upstream fragment was amplified using primers 47020: 5′-GTGTAAATCATTTAATGAAAAAAGGAAAAATTGACGTG-3′ (SEQ ID NO: 15) and 47023: 5′-ATCATTAAGCCTTCCTGGCAGTCAGCCCTAGTGCTTGATGTCGGTTTG-3′ (SEQ ID NO: 16). The downstream fragment was amplified using primers 47026: 5′-AATTAAAAGCCATTGAGGCAGACGTAAGGGAGGATACAATCATGGCAATT-3′ (SEQ ID NO: 17) and 47021: 5′-GGTATTCTTGCTGACAACGGTACATTCATATG-3′ (SEQ ID NO: 18). The genes encoding UPP/Kan were amplified from the template pUC19-UPP-KAN using primers 47024: 5′-ACACGATATAGCCAGGAAGGCGGGTTTTTTGACGATGTTCTTGAAACTC-3′ (SEQ ID NO: 19) and 47025: 5′-AATTAAAAGCCACAAAGGCCTAGGTACTAAAACAATTCATCCAGTAA-3′ (SEQ ID NO: 20).

The upstream, downstream and UPP/Kan fragments were all digested to completion with restriction endonuclease BglI. All 3 fragments were subsequently ligated together with T4 DNA ligase. The ligated DNA mix was transformed into FA-Glu producing strain 43074-B2 and transformants were selected for ability to grow on LB agar supplemented with kanamycin (30 μg/mL) and isobutyric, isovaleric and 2-methylbutyric acids (100 μM). One strain derived from this strategy, which had the correct sequence to replace bkdAA and bkdAB with UPP/Kan, was named 47392-A6 and was used in subsequent experiments.
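A short sketch can be used to locate the BglI sites that the digestion acts on. BglI recognizes the interrupted palindrome GCCNNNNNGGC, so a pattern search on one strand suffices. The primer sequences are those given for the upstream, downstream and UPP/Kan fragments; the function name is hypothetical, and the observation that each junction primer carries one site is read from the sequences rather than stated in the text.

```python
import re

# BglI recognition sequence GCCNNNNNGGC; the lookahead lets overlapping
# candidate sites be reported as well.
BGLI_SITE = re.compile(r"(?=(GCC[ACGT]{5}GGC))")


def find_bgli_sites(seq: str):
    """Return (0-based start, matched site) for every BglI site in seq."""
    return [(m.start(), m.group(1)) for m in BGLI_SITE.finditer(seq.upper())]


PRIMERS = {
    "47020": "GTGTAAATCATTTAATGAAAAAAGGAAAAATTGACGTG",
    "47023": "ATCATTAAGCCTTCCTGGCAGTCAGCCCTAGTGCTTGATGTCGGTTTG",
    "47026": "AATTAAAAGCCATTGAGGCAGACGTAAGGGAGGATACAATCATGGCAATT",
    "47021": "GGTATTCTTGCTGACAACGGTACATTCATATG",
    "47024": "ACACGATATAGCCAGGAAGGCGGGTTTTTTGACGATGTTCTTGAAACTC",
    "47025": "AATTAAAAGCCACAAAGGCCTAGGTACTAAAACAATTCATCCAGTAA",
}

for name, seq in PRIMERS.items():
    print(name, find_bgli_sites(seq))
```

Run on these primers, the search reports one site in each of the four junction primers (47023, 47024, 47025 and 47026) and none in the outer primers 47020 and 47021. The five variable bases differ between the two junctions, which would be expected to favor ligation of the three fragments in a single order.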

47392-A6 was grown alongside 43074-B2 in S7(Phos7.5) (minimal media containing 100 mM Potassium Phosphate Buffer pH 7.5, 10 mM Ammonium Sulfate, 20 mM Monosodium Glutamate, 2% Glucose and trace metals) supplemented with 0, 100 μM, 1 mM, 5 mM or 20 mM 2-methylbutyric, isovaleric or isobutyric acid (all neutralized to pH 7.5) in 10 mL cultures for 4 days at 37° C.

TABLE A

Strain   Supplement               344 = C12  358 = C13  372 = C14  386 = C15  400 = C16  414 = C17  FA-Glu (mg/L)
Control  No Acid                  1.60%      16.29%     54.78%     26.02%     1.19%      0.12%      439.2
Control  100 μM 2-methylbutyric   1.76%      18.27%     52.08%     26.68%     1.09%      0.12%      397.1
Control  1 mM 2-methylbutyric     1.25%      23.84%     34.54%     39.28%     0.74%      0.35%      443.8
Control  5 mM 2-methylbutyric     0.99%      26.91%     22.05%     49.22%     0.38%      0.46%      409.6
Control  20 mM 2-methylbutyric    0.57%      26.79%     16.49%     55.19%     0.30%      0.65%      333.6
Control  100 μM Isovaleric        1.66%      17.42%     53.04%     26.70%     1.05%      0.12%      451.4
Control  1 mM Isovaleric          1.15%      24.84%     39.84%     33.28%     0.75%      0.15%      437.6
Control  5 mM Isovaleric          0.64%      34.26%     19.87%     44.67%     0.33%      0.22%      434.4
Control  20 mM Isovaleric         0.53%      34.06%     8.55%      56.54%     0.14%      0.19%      338.5
Control  100 μM Isobutyric        1.72%      15.64%     58.19%     23.08%     1.23%      0.13%      457.1
Control  1 mM Isobutyric          1.53%      11.44%     63.98%     21.51%     1.45%      0.10%      470.1
Control  5 mM Isobutyric          1.55%      9.43%      69.63%     17.76%     1.53%      0.09%      433.2
Control  20 mM Isobutyric         1.33%      9.09%      69.83%     17.86%     1.82%      0.07%      434.5
Mutant   No Acid                  no growth observed
Mutant   100 μM 2-methylbutyric   3.71%      10.41%     76.04%     7.56%      2.20%      0.07%      401.4
Mutant   1 mM 2-methylbutyric     2.38%      25.73%     32.49%     38.46%     0.57%      0.36%      441.4
Mutant   5 mM 2-methylbutyric     1.00%      31.76%     10.00%     56.32%     0.21%      0.71%      415.2
Mutant   20 mM 2-methylbutyric    0.68%      27.28%     6.37%      64.77%     0.17%      0.73%      307.2
Mutant   100 μM Isovaleric        3.53%      8.30%      78.33%     7.89%      1.93%      0.02%      417.9
Mutant   1 mM Isovaleric          1.28%      22.86%     36.65%     38.72%     0.43%      0.06%      370.8
Mutant   5 mM Isovaleric          0.48%      38.41%     11.76%     49.02%     0.20%      0.13%      425.8
Mutant   20 mM Isovaleric         0.31%      36.41%     4.14%      58.89%     0.09%      0.16%      334.9
Mutant   100 μM Isobutyric        2.88%      5.96%      84.74%     4.67%      1.72%      0.03%      250.1
Mutant   1 mM Isobutyric          2.34%      3.37%      90.10%     2.08%      2.08%      0.02%      420.3
Mutant   5 mM Isobutyric          1.82%      0.66%      94.03%     1.01%      2.48%      0.01%      433.0
Mutant   20 mM Isobutyric         1.68%      0.30%      94.50%     0.81%      2.69%      0.02%      390.7

BKD Up-U/K-Down Sequence Using Restriction Sites (SEQ ID NO: 21):

AATATCGTATTGAATAGACAGACAGGAGTGAGTCACCATGGCAACTGAGTATGACGTAGTCATTCTGGGCGGCGGTACCGGCGGTTATGTTGCGGCCATCAGAGCCGCTCAGCTCGGCTTAAAAACAGCCGTTGTGGAAAAGGAAAAACTCGGGGGAACATGTCTGCATAAAGGCTGTATCCCGAGTAAAGCGCTGCTTAGAAGCGCAGAGGTATACCGGACAGCTCGTGAAGCCGATCAATTCGGAGTGGAAACGGCTGGCGTGTCCCTCAACTTTGAAAAAGTGCAGCAGCGTAAGCAAGCCGTTGTTGATAAGCTTGCAGCGGGTGTAAATCATTTAATGAAAAAAGGAAAAATTGACGTGTACACCGGATATGGACGTATCCTTGGACCGTCAATCTTCTCTCCGCTGCCGGGAACAATTTCTGTTGAGCGGGGAAATGGCGAAGAAAATGACATGCTGATCCCGAAACAAGTGATCATTGCAACAGGATCAAGACCGAGAATGCTTCCGGGTCTTGAAGTGGACGGTAAGTCTGTACTGACTTCAGATGAGGCGCTCCAAATGGAGGAGCTGCCACAGTCAATCATCATTGTCGGCGGAGGGGTTATCGGTATCGAATGGGCGTCTATGCTTCATGATTTTGGCGTTAAGGTAACGGTTATTGAATACGCGGATCGCATATTGCCGACTGAAGATCTAGAGATTTCAAAAGAAATGGAAAGTCTTCTTAAGAAAAAAGGCATCCAGTTCATAACAGGGGCAAAAGTGCTGCCTGACACAATGACAAAAACATCAGACGATATCAGCATACAAGCGGAAAAAGACGGAGAAACCGTTACCTATTCTGCTGAGAAAATGCTTGTTTCCATCGGCAGACAGGCAAATATCGAAGGCATCGGCCTAGAGAACACCGATATTGTTACTGAAAATGGCATGATTTCAGTCAATGAAAGCTGCCAAACGAAGGAATCTCATATTTATGCAATCGGAGACGTAATCGGTGGCCTGCAGTTAGCTCACGTTGCTTCACATGAGGGAATTATTGCTGTTGAGCATTTTGCAGGTCTCAATCCGCATCCGCTTGATCCGACGCTTGTGCCGAAGTGCATTTACTCAAGCCCTGAAGCTGCCAGTGTCGGCTTAACCGAAGACGAAGCAAAGGCGAACGGGCATAATGTCAAAATCGGCAAGTTCCCATTTATGGCGATTGGAAAAGCGCTTGTATACGGTGAAAGCGACGGTTTTGTCAAAATCGTGGCTGACCGAGATACAGATGATATTCTCGGCGTTCATATGATTGGCCCGCATGTCACCGACATGATTTCTGAAGCGGGTCTTGCCAAAGTGCTGGACGCAACACCGTGGGAGGTCGGGCAAACGATTCACCCGCATCCAACGCTTTCTGAAGCAATTGGAGAAGCTGCGCTTGCCGCAGATGGCAAAGCCATTCATTTTTAAAAGCATAAAGGAGGGGCTTGAATGAGTACAAACCGACATCAAGCACTAGGGCTGACTGCCAGGAAGGC GGGTTTTTTGACG 1200 1201ATGTTCTTGAAACTCAATGTCTTTTTTTGTAGAATCAATAGAAGTGTGTA 1250 1251ATTGTTGATGGGACAATAAAAAAGGAGCTGAAACACAGTATGGGAAAGGT 1300 1301TTATGTATTTGATCATCCTTTAATTCAGCACAAGCTGACATATATACGGA 1350 1351ATGAAAATACAGGTACGAAGGATTTTAGAGAGTTAGTAGATGAAGTGGCT 1400 1401ACACTCATGGCATTTGAAATTACCCGCGATCTTCCTCTGGAAGAAGTGGA 1450 1451TATCAATACACCGGTTCAGGCTGCGAAATCGAAAGTCATCTCAGGGAAAA 1500 1501AACTCGGAGTGGTTCCTATCCTCAGAGCAGGATTGGGAATGGTTGACGGC 1550 
1551ATTTTAAAGCTGATTCCTGCGGCAAAAGTGGGACATGTCGGCCTTTACCG 1600 1601TGATCCAGAAACCTTAAAACCCGTGGAATACTATGTCAAGCTTCCTTCTG 1650 1651ATGTGGAAGAGCGTGAATTCATCGTGGTTGACCCGATGCTCGCTACAGGC 1700 1701GGTTCCGCAGTTGAAGCCATTCACAGCCTTAAAAAACGCGGTGCGAAAAA 1750 1751TATCCGTTTCATGTGTCTTGTAGCAGCGCCGGAGGGTGTGGAAGAATTGC 1800 1801AGAAGCATCATTCGGACGTTGATATTTACATTGCGGCGCTAGATGAAAAA 1850 1851TTAAATGAAAAAGGATATATTGTTCCAGGTCTCGGAGATGCGGGTGACCG 1900 1901CATGTTTGGAACAAAATAAAAAATGAAATCCCCAAAAGGGGGTTTCATTT 1950 1951TTTTATCCAGTTTTTTGCTATTCGGTGAATCTGTATACAATTATAGGTGA 2000 2001AAATGTGAACATTCTGGGATCCGATAAACCCAGCGAACCATTTGAGGTGA 2050 2051TAGGTAAGATTATACCGAGGTATGAAAACGAGAATTGGACCTTTACAGAA 2100 2101TTACTCTATGAAGCGCCATATTTAAAAAGCTACCAAGACGAAGAGGATGA 2150 2151AGAGGATGAGGAGGCAGATTGCCTTGAATATATTGACAATACTGATAAGA 2200 2201TAATATATCTTTTATATAGAAGATATCGCCGTATGTAAGGATTTCAGGGG 2250 2251GCAAGGCATAGGCAGCGCGCTTATCAATATATCTATAGAATGGGCAAAGC 2300 2301ATAAAAACTTGCATGGACTAATGCTTGAAACCCAGGACAATAACCTTATA 2350 2351GCTTGTAAATTCTATCATAATTGTGGTTTCAAAATCGGCTCCGTCGATAC 2400 2401TATGTTATACGCCAACTTTCAAAACAACTTTGAAAAAGCTGTTTTCTGGT 2450 2451ATTTAAGGTTTTAGAATGCAAGGAACAGTGAATTGGAGTTCGTCTTGTTA 2500 2501TAATTAGCTTCTTGGGGTATCTTTAAATACTGTAGAAAAGAGGAAGGAAA 2550 2551TAATAAATGGCTAAAATGAGAATATCACCGGAATTGAAAAAACTGATCGA 2600 2601AAAATACCGCTGCGTAAAAGATACGGAAGGAATGTCTCCTGCTAAGGTAT 2650 2651ATAAGCTGGTGGGAGAAAATGAAAACCTATATTTAAAAATGACGGACAGC 2700 2701CGGTATAAAGGGACCACCTATGATGTGGAACGGGAAAAGGACATGATGCT 2750 2751ATGGCTGGAAGGAAAGCTGCCTGTTCCAAAGGTCCTGCACTTTGAACGGC 2800 2801ATGATGGCTGGAGCAATCTGCTCATGAGTGAGGCCGATGGCGTCCTTTGC 2850 2851TCGGAAGAGTATGAAGATGAACAAAGCCCTGAAAAGATTATCGAGCTGTA 2900 2901TGCGGAGTGCATCAGGCTCTTTCACTCCATCGACATATCGGATTGTCCCT 2950 2951ATACGAATAGCTTAGACAGCCGCTTAGCCGAATTGGATTACTTACTGAAT 3000 3001AACGATCTGGCCGATGTGGATTGCGAAAACTGGGAAGAAGACACTCCATT 3050 3051TAAAGATCCGCGCGAGCTGTATGATTTTTTAAAGACGGAAAAGCCCGAAG 3100 3101AGGAACTTGTCTTTTCCCACGGCGACCTGGGAGACAGCAACATCTTTGTG 3150 3151AAAGATGGCAAAGTAAGTGGCTTTATTGATCTTGGGAGAAGCGGCAGGGC 3200 
3201GGACAAGTGGTATGACATTGCCTTCTGCGTCCGGTCGATCAGGGAGGATA 3250 3251TCGGGGAAGAACAGTATGTCGAGCTATTTTTTGACTTACTGGGGATCAAG 3300 3301CCTGATTGGGAGAAAATAAAATATTATATTTTACTGGATGAATTGTTTTA 3350 3351GTACCTAGGCCTTTG AGGCAGACGTAAGGGAGGATACAATCATGGCAATTGAACAAATGACGATGCCGCAGCTTGGAGAAAGCGTAACAGAGGGGACGATCAGCAAATGGCTTGTCGCCCCCGGTGATAAAGTGAACAAATACGATCCGATCGCGGAAGTCATGACAGATAAGGTAAATGCAGAGGTTCCGTCTTCTTTTACTGGTACGATAACAGAGCTTGTGGGAGAAGAAGGCCAAACCCTGCAAGTCGGAGAAATGATTTGCAAAATTGAAACAGAAGGCGCGAATCCGGCTGAACAAAAACAAGAACAGCCAGCAGCATCAGAAGCCGCTGAGAACCCTGTTGCAAAAAGTGCTGGAGCAGCCGATCAGCCCAATAAAAAGCGCTACTCGCCAGCTGTTCTCCGTTTGGCCGGAGAGCACGGCATTGACCTCGATCAAGTGACAGGAACTGGTGCCGGCGGGCGCATCACACGAAAAGATATTCAGCGCTTAATTGAAACAGGCGGCGTGCAAGAACAGAATCCTGAGGAGCTGAAAACAGCAGCTCCTGCACCGAAGTCTGCATCAAAACCTGAGCCAAAAGAAGAGACGTCATATCCTGCGTCTGCAGCCGGTGATAAAGAAATCCCTGTCACAGGTGTAAGAAAAGCAATTGCTTCCAATATGAAGCGAAGCAAAACAGAAATTCCGCATGCTTGGACGATGATGGAAGTCGACGTCACAAATATGGTTGCATATCGCAACAGTATAAAAGATTCTTTTAAGAAGACAGAAGGCTTTAATTTAACGTTCTTCGCCTTTTTTGTAAAAGCGGTCGCTCAGGCGTTAAAAGAATTCCCGCAAATGAATAGCATGTGGGCGGGGGACAAAATTATTCAGAAAAAGGATATCAATATTTCAATTGCAGTTGCCACAGAGGATTCTTTATTTGTTCCGGTGATTAAAAACGCTGATGAAAAAACAATTAAAGGCATTGCGAAAGACATTACCGGCCTAGCTAAAAAAGTAAGAGACGGAAAACTCACTGCAGATGACATGCAGGGAGGCACGTTTACCGTCAACAACACAGGTTCGTTCGGGTCTGTTCAGTCGATGGGCATTATCAACTACCCTCAGGCTGCGATTCTTCAAGTAGAATCCATCGTCAAACGCCCGGTTGTCATGGACAATGGCATGATTGCTGTCAGAGACATGGTTAATCTGTGCCTGTCATTAGATCACAGAGTGCTTGACGGTCTCGTGTGCGGACGATTCCTCGGACGAGTGAAACAAATTTTAGAATCGATTGACGAGAAGACATCTGTTTACTAAATAAGCAAAAAGAGCATTTTTTGAAGTTTTGTTTCAAAAAATGCTCTTTTTCTATGCTTTATTATTCAGCGATCCGTATTTTCATTTCGACTCGATATTCTTCTTGTTTTTTCGGGGAGTAATGAATCGGTATGATTAACTCGTATACATCACTGACAACTGTTAATTGGCGGTCCGCGATATATTTGATAAGCTTCTGTAAGTTGAGAAAATAATGTTCAGGCGAAAAATTATACGCGATACACGCATACCTCCCTTTAGGGATCGTTGTGATTTCCATATCCGGCGTAATTGATGAAATCTGTTTATTCGTCAATACAGGTGTGAAAATATGACGGTAAGTCATTTCATCAATGCTGGTGTAGGGCTGAAAAGAGAAAGTAGCGCCGTAGCTATTGTTCGTAAATCCATCTGCTGACTCGATAAATTTTTTTAATTTGCTGTAGGAGGCGTTGAGCACGTTTTCAGGCCCGATTCCTTCTGCCTCTGTCTGAATGAT
CCGTATTTCTTCTTCATCTAAAACAAACACCTCACCGAGCGCGGGATATTCCATCTGCCGTTTCATCCGCTTTTTCACCAATGAAATGGTTTGCTCCAGGGCTGATAAAAAGTCTAATTTCTCCCTGATTTGCCTCTCCTGCTCTGTATAAAAAGCAAACAGTTCTTCCATCTCTAAGTCCTGTGCTTTTTTCATCTCTTCTAAAGGTGTGCCGATATATTTCAATGATTTGATCAAATCCAGATGAATGAGCTGAGAATCTGTATAATAGCGGTAGCTGGTATCCGGGTCGACGTAGGCTGGTTTAAATAAATCAATTTTATCGTAATAACGGAGCGCTTTTATCGACACGTTTGCCAGTTTTGATACTTCCCCAATTGAGTAATACGATTCCTTCATGCCATCACTCCTTCTATCATCAGTATAAAGAAGAAGCGCATTCTTTGCAGTACACAAAGAATGCGCTTCTTATCACGTGCTGGCTTTAAGATGTGCAGGCGCTTTCCAAGCAATGGTCAGTGCAATCCCTATGGCTAAGGTGACCGTTGCAAAGTAGAAAGGATAGTTTACATCTATATCGAACAGCATTCCGCCGATAATAGGCCCGAATACATTGCCGATACTTGTAAACATTGAATTCATACCGCCGGCAAACCCCTGTTCATTTCCCGCAATCTTTGACAGGTAAGTCGTTACCGCAGGCCGCATGAGATCAAATCCGACAAATACGGTGACTGTCACCAGCAGAATCGCAACATATGAATGTACCGTTGTCAGCAAGAATACCAGACTCGTCGAGAGAATTAAGCTGTACCGAATTAAATGAATTTCGCCAAACCATCTTGTGAAGCGGTCGAATAAGACGACTTGCGTAATGGCGCCAACAATCGCTCCTCCTGTAATCATAATGGCAATGTCGCTGGCCGTAAATCCGAATTTATGATCCACGAATAATGCAAATAAAGAT (SEQ ID NO: 21):

Example 10

The following genes were deleted by replacing the coding sequence of each gene with a upp/kan cassette: Maf, Abh, RocG, degU, RapC, eps, yngF, yhaR, mmgB, spxA. The effect on FA-Glu yield is shown in Table #.

An additional copy of each of the following genes was introduced into Bacillus under the control of either a constitutive promoter (e.g., PgroEL) or the Psrf promoter, which normally controls expression of genes in the srf operon (which genes are required for production of surfactin). The effect on FA-Glu yield is shown in Table B:

TABLE B

Single Knockouts   FA-Glu Increase relative to parental strain   Ave
RapC               34.1%, 17.6%                                  25.9%
plip               21.2%                                         21.2%
yqxM               19.7%                                         19.7%
eps                19.1%                                         19.1%
degU               13.3%, 22.8%                                  18.0%
yngF               14.5%                                         14.5%
RocG               12.0%                                         12.0%
yhaR               13.3%, 9.6%                                   11.5%
mmgB               11.4%                                         11.4%
abh                6.2%, 16.0%, 6.9%                             9.7%
maf                15.6%, 0.5%                                   8.0%
spoIIAC            7.8%                                          7.8%
fapR               3.3%, 6.3%                                    4.8%
spxA               2.7%                                          2.7%

Knockin                  FA-Glu Increase relative to parental strain
eps−>pGroEL-lcfA         11.5%
amyE−>Pspac-srfD         12.7%
amyE−>PgroEL-sfp-srfD    44.3%
phe+                     79.2%

NOTE: All Single knockouts are in the 43074-B2 background that contains 1) plip KO, 2) phe+, 3) amyE−>PgroEL-sfp-srfD and 4) spoIIAC KO
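The "Ave" values in Table B appear to be arithmetic means of the individual measurements listed for each knockout. The sketch below checks that reading for the knockouts with more than one value. The grouping of values into measurements and averages is an interpretation of the table layout rather than something the text states, and the names used are hypothetical.

```python
from statistics import mean

# Knockouts with more than one listed value: (individual values, reported Ave),
# all in percent increase relative to the parental strain.
REPLICATES = {
    "RapC": ([34.1, 17.6], 25.9),
    "degU": ([13.3, 22.8], 18.0),
    "yhaR": ([13.3, 9.6], 11.5),
    "abh": ([6.2, 16.0, 6.9], 9.7),
    "maf": ([15.6, 0.5], 8.0),
    "fapR": ([3.3, 6.3], 4.8),
}


def matches_reported(values, reported, tolerance=0.06):
    """True if the mean of values agrees with the reported average,
    allowing for rounding to one decimal place."""
    return abs(mean(values) - reported) <= tolerance


for gene, (values, reported) in REPLICATES.items():
    print(f"{gene}: mean {mean(values):.2f}%, reported {reported}%, "
          f"consistent: {matches_reported(values, reported)}")
```

Every multi-value row is consistent with this interpretation to within rounding, which supports treating the remaining single-value rows as single measurements.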

REFERENCES

-   1. Krass et al., “Functional Dissection of Surfactin Synthetase Initiation Module Reveals Insights into the Mechanism of Lipoinitiation,” Chemistry & Biology, 17:872-880, 2010.
-   2. Roongsawang et al., “Phylogenetic analysis of condensation domains in nonribosomal peptide synthetases,” FEMS Microbiology Letters, 252:143-151, 2005.
-   3. Rausch et al., “Phylogenetic analysis of condensation domains in NRPS sheds light on their functional evolution,” BMC Evolutionary Biology, 7(78):1-15, 2007.
-   4. Segolene et al., “NORINE: a database of nonribosomal peptides,” Nucleic Acids Research, 36:D327-D331, 2008.
-   5. Steller et al., “Initiation of Surfactin Biosynthesis and the Role of the SrfD-Thioesterase Protein,” Biochemistry, 43:11331-11343, 2004.
-   6. Duitman et al., “The Mycosubtilin Synthetase of Bacillus subtilis ATCC6633: A Multifunctional Hybrid Between a Peptide Synthetase, an Amino Transferase, and a Fatty Acid Synthase,” PNAS, 96(23):13294-13299, 1999.
-   7. Hansen et al., “The Loading Module of Mycosubtilin: An Adenylation Domain with Fatty Acid Selectivity,” J Am Chem Soc, 129(20):6366-6367, 2007.
-   8. Aron et al., “FenF: Servicing the Mycosubtilin Synthetase Assembly Line in trans,” ChemBioChem, 8:613-616, 2007.
-   9. Wittmann et al., “Role of DptE and DptF in the lipidation reaction of daptomycin,” FEBS Journal, 275:5343-5353, 2008.
-   10. Kleinkauf et al., “A nonribosomal system of peptide biosynthesis,” Eur J Biochem, 236:335-351, 1996.
-   11. Miao et al., “Daptomycin biosynthesis in Streptomyces roseosporus: cloning and analysis of the gene cluster and revision of peptide stereochemistry,” Microbiology, 151:1507-1523, 2005.
-   12. Zohreh et al., “Structure, biosynthetic origin, and engineering Biosynthesis of calcium-Dependent Antibiotics from Streptomyces coelicolor,” Chemistry & Biology, 9:1175-1187, 2002.
-   13. Kaneda, “Iso- and anteiso-fatty acids in bacteria: biosynthesis, function, and taxonomic significance,” Microbiological Reviews, 55(2):288-302, 1991.
-   14. Tsay et al., “Isolation and characterization of the β-ketoacyl-acyl carrier protein synthase III gene (fabH) from Escherichia coli K-12,” JBC, 267(10):6807-6814, 1992.
-   15. Choi et al., “β-ketoacyl-acyl carrier protein synthase III (FabH) is a determining factor in branched-chain fatty acid biosynthesis,” Journal of Bacteriology, 182(2):365-370, 2000.
-   16. Kaneda, “Fatty acids of the genus Bacillus: an example of branched-chain preference,” Bacteriological Reviews, 41(2):391-418, 1977.
-   17. Willecke et al., “Fatty acid-requiring mutant of Bacillus subtilis defective in branched chain α-keto acid dehydrogenase,” The Journal of Biological Chemistry, 246(17):5264-5272.
-   18. Li et al., “Alteration of the fatty acid profile of Streptomyces coelicolor by replacement of the initiation enzyme 3-ketoacyl acyl carrier protein synthase III (FabH).”
-   19. Kaneda and Smith, “Relationship of primer specificity of fatty acid de novo synthetase to fatty acid composition in 10 species of bacteria and yeasts,” Can. J. Microbiol., Vol 26, 1980.
-   20. Welch, “Application of cellular fatty acid analysis,” Clinical Microbiology Reviews, October 1991, 422-438.
-   21. Khandekar et al., “Identification, substrate specificity, and inhibition of the Streptococcus pneumoniae β-ketoacyl-acyl carrier protein synthase III (FabH),” JBC, 276(32), 2001.
-   22. “Fatty Acid Biosynthesis in Pseudomonas aeruginosa Is Initiated by the FabY Class of β-Ketoacyl Acyl Carrier Protein Synthases.”
-   23. Li et al., “Identification and functional expression of a Δ9-fatty acid desaturase from Psychrobacter urativorans in Escherichia coli,” Lipids, 43:207-213, 2008.
-   24. Sakuradani et al., “Δ9-fatty acid desaturase from arachidonic acid-producing fungus unique gene sequence and its heterologous expression in a fungus, Aspergillus,” Eur. J. Biochem., 260:208-219, 1999.
-   25. Aguilar et al., “A Bacillus subtilis gene induced by cold shock encodes a membrane phospholipid desaturase,” Journal of Bacteriology, 180(8):2194-2200, 1998.
-   26. Martin et al., “A lipA (yutB) mutant, encoding lipoic acid synthase, provides insight into the interplay between branched-chain and unsaturated fatty acid biosynthesis in Bacillus subtilis,” Journal of Bacteriology, 191(24):7447-7455, 2009.
-   27. Lee et al., “Cloning of srfA operon from Bacillus subtilis C9 and its expression in E. coli,” Appl Microbiol Biotechnol, 75(3):567-572, 2007.
-   28. Quadri et al., “Characterization of Sfp, a Bacillus subtilis phosphopantetheinyl transferase for peptidyl carrier domains in peptide synthetases,” Biochemistry, 37(6):1585-1595, 1998.
-   29. Reznik et al., “Use of sustainable chemistry to produce an acyl amino acid surfactant,” Appl Microbiol Biotechnol, published online, 2010.
-   30. Wang et al., “The primary structure of branched-chain α-oxo acid dehydrogenase from Bacillus subtilis and its similarity to other α-oxo acid dehydrogenases,” Eur. J. Biochem., 213:1091-1099, 1993.
-   31. This reference is out of order relative to its position in the text. Felnagle et al., “Identification of the biosynthetic gene cluster and an additional gene for resistance to the antituberculosis drug capreomycin,” Applied and Environmental Microbiology, 73(13):4162-4170, 2007.
-   32. This reference is out of order relative to where it is mentioned in the text. Beasley et al., “Mutation of L-2,3-diaminopropionic acid synthase genes blocks staphyloferrin B synthesis in Staphylococcus aureus,” BMC Microbiology, 11:199, 2011.
-   33. This reference is out of order relative to its position in the text. Simon and Shokat, “A method to site-specifically incorporate methyl-lysine analogues into recombinant proteins,” Methods in Enzymology, Volume 512, Nucleosomes: Histones & Chromatin, Part A, edited by Carl Wu and C. David Allis, Elsevier, Inc., 2012.
-   34. This reference is out of order relative to its position in the text. Zhang et al., “Catalytic promiscuity of a bacterial α-N-methyltransferase,” FEBS Letters, 586:3391-3397, 2012.
-   35. This reference is out of order relative to its position in the text. Komiyama et al., “A new antibiotic, cypemycin. Taxonomy, fermentation, isolation and biological characteristics,” The Journal of Antibiotics, 46(11):1666-1671, 1993.

Sequence Listing Proteins for synthesis of 2,3-diaminopropionic acidsbnA >sp|Q7A1Z6|SBNA STAAW Probable siderophore biosynthesis protein SbnAOS = Staphylococcus aureus (strain MW2) GN = sbnA PE = 3 SV = 1MIEKSQACHDSLLDSVGQTPMVQLHQLFPKHEVFAKLEYMNPGGSMKDRPAKYIIEHGIKHGLITENTHLIESTSGNLGIALAMIAKIKGLKLICVVDPKISPINLKIIKSYGANVEMVEEPDAHGGYLMTRIAKVQELLATIDDAYWINQYANELNWQSHYHGAGTEIVETIKQPIDYFVAPVSTIGSIMGMSRKIKEVHPNAQIVAVDAKGSVIFGDKPINRELPGIGASRVPEILNRSEINQVIHVDDYQSALGCRKLIDYEGIFAGGSTGSIIAAIEQLITSIEEGATIVTILPDRGDRYLDLVYSDTWLEKMKSRQGVKSE (SEQ ID NO: 22)sbnB >tr|Q6X7U6|Q6X7U6_STAAU SbnB OS = Staphylococcus aureus GN =sbnB PE = 4 SV = 1MNREMLYLNRSDIEQAGGNHSQVYVDALTEALTAHAHNDFVQPLKPYLRQDPENGHIADRIIAMPSHIGGEHAISGIKWIGSKHDNPSKRNMERASGVIILNDPETNYPIAVMEASLISSMRTAAVSVIAAKHLAKKGFKDLTIIGCGLIGDKQLQSMLEQFDHIERVFVYDQFSEACARFVDRWQQQRPEINFIATENAKEAVSNGEVVITCTVIDQPYIEYDWLQKGAFISNISIMDVHKEVFIKADKVVVDDWSQCNREKKTINQLVLEGKFSKEALHAELGQLVTGDIPGREDDDEIILLNPMGMAIEDISSAYFIYQQAQQQNIGTTLNLY (SEQ ID NO: 23)ZmaU >gi|223047493|gb|ACM79820.1| ZmaU [Bacillus cereus]MSFRYKFYLKYIRKNIYTYLSLIIFLDFNQERKQIMLKKLESLERVIGNIPMIKLEHEKINLYAKLEYYNLMNSVKVRAAYHILKSAINRGEVNENSTIIESSSGNFAVALATLCRYIGLKFIPVIDPNINDSYENFLRATSYQVANVDERDEIGGYLLTRLNKVKELLNTIPNAYWINQYNNADNFEAHYQGIGGEISNDFKQLDYAFIGVSIGGTIAGVSTRLKEKFPNIKIIAVDSQGSIIFGDKPRKRYIPGIGASMIPGMVKKALIDDVMIVPEVHTVAGCYELFNKHAIFAGGSSGTSYYAIQKYFENRDVQNTPNVVFLCPDNGQAYISTIYNVEWVEWLNIQKSVEDQLVSL (SEQ ID NO: 24) ZmaV >gi|223047494|gb|ACM79821.1|ZmaV [Bacillus cereus]MMYLNIKHENEMGVNWEETINVISKAVKSLDSEDFSQPIKPYLRFDDPANRIIAMPAYIGGEFKVSGIKWIASFPKNIEKGIQRAHSVTILNDAMIGKPFATLNTAMVSVIRTASVTGLMIREFAKLRDLNNVKVGIIGFGPIGQMHLKMVTALLGDKIEGVYLYDINGIKDELIPEEIYSKTQKVNAYEEAYNDADIFITCTVSAEGYIDKKPKDGALLLNVSLRDFKPDILEYTKSLVVDNWEEVCREKTDVERMHLERGLQKEDIVSIADVVIRGALONFPYDKAILFNPMGMAIFDVAIAAYYYQRARENEMGVLLED (SEQ ID NO: 25)MethyltrasferasesBacillus prmA >gnl|BSUB|BSU25450-MONOMER ribosomal protein L11 methyltransferase(complement(2624760..2623825)) Bacillus subtilis subtilis 168MKWSELSIHT THEAVEPISN ILHEAGASGV 
VIEDPLDLIK ERENVYGEIY QLDPNDYPDEGVIVKAYLPV NSFLGETVDG IKETINNLLL YNIDLGRNHI TISEVNEEEW ATAWKKYYHPVKISEKFTIV PTWEEYTPVH TDELIIEMDP GMAFGTGTHP TTVLCIQALE RFVQKGDKVIDVGTGSGILS IAAAMLEAES VHAYDLDPVA VESARLNLKL NKVSDIAQVK QNNLLDGIEGEHDVIVANIL AEVILRFTSQ AYSLLKEGGH FITSGIIGHK KQEVKEALEQ AGFTIVEILSMEDWVSIIAK K (SEQ ID NO: 26)E. coli prmA >gnl|ECOLI|EG11497-MONOMER methyltransferase for 50S ribosomal subunit proteinL11 3407092..3407973 Escherichia coli K-12 substr. MG1655MPWIQLKLNT TGANAEDLSD ALMEAGAVSI TFQDTHDTPV FEPLPGETRL WGDTDVIGLFDAETDMNDVV AILENHPLLG AGFAHKIEQL EDKDWEREWM DNFHPMRFGE RLWICPSWRDVPDENAVNVM LDPGLAFGTG THPTTSLCLQ WLDSLDLTGK TVIDFGCGSG ILAIAALKLGAAKAIGIDID PQAIQASRDN AERNGVSDRL ELYLPKDQPE EMKADVVVAN ILAGPLRELAPLISVLPVSG GLLGLSGILA SQAESVCEAY ADSFALDPVV EKEEWCRITG RKN (SEQ ID NO: 27)cypemycin methyltrasferase >sp|E5KIC0|CYPM_STRSQ Cypemycin methyltransferase OS =Streptomyces sp. GN = cypM PE = 1 SV = 1MSDPSVYDETAIEAYDLVSSMLSPGAGLVAWVSSHRPLDGRTVLDLGCGTGVSSFALAEAGARVVAVDASRPSLDMLEKKRLDRDVEAVEGDFRDLTFDSTFDVVTMSRNTFFLAQEQEEKIALLRGIARHLKPGGAAFLDCTDPAEFQRAGGDARSVTYPLGRDRMVTVTQTADRAGQQILSIFLVQGATTLTAFHEQATWATLAEIRLMARIAGLEVTGVDGSYAGEPYTARSREMLVVLERQ (SEQ ID NO: 28)Streptomyces griseus methyltransferase >gi|182440155|ref|YP_001827874.1|methyltransferase [Streptomyces griseus subsp. 
griseus NBRC 13350]MSEPTVYDAAAIDAYDLISSMLSPGAGLAAWVSSHRPLAGRTVLDLGAGTGVSSFALADAGAQVVAVDASRPSLDLLESRRGERKVDTVEADFRDLRLDSAFDVVTMSKNTFFLAQSHDEKIELLRAIGRHLKPGGAVFLDCTDPVEYLRADGAAHTVTYPLGREQMVTITQNADRATQAIMSIFMVQSASTLTSFHEMATWASLPEIRLLARAAGLEVTAVDGSYAGDAYTARSREMLVVLEAK (SEQ ID NO: 29)Proteins for initiation of straight chain fatty acid synthesisfadH family members for initiation of straight chain fatty acid synthesisM77744>M77744_1(M77744|pid:none) Escherichia coli beta-ketoacyl-acyl carrier proteinsynthase III (fabH) gene, complete cds.MYTKIIGTGSYLPEQVRTNADLEKMVDTSDEWIVTRTGIRERHIAAPNETVSTMGFEAATRAIEMAGIEKDQIGLIVVATTSATHAFPSAACQIQSMLGIKGCPAFDVAAACAGFTYALSVADQYVKSGAVKYALVVGSDVLARTCDPTDRGTIIIFGDGAGAAVLAASEEPGIISTHLHADGSYGELLTLPNADRVNPENSIHLTMAGNEVEKVAVTELAHIVDETLAANNLDRSQLDWLVPHQANLRIISATAKKLGMSMDNVVVTLDRHGNTSAASVPCALDEAVRDGRIKPGQLVLLEAFGGGFTWGSALVRF (SEQ ID NO: 30)AF384041 >sp|P0A3C5|FABH_STRPN 3-oxoacyl-[acyl-carrier-protein]synthase 3 OS =Streptococcus pneumoniae serotype 4 (strain ATCC BAA-334/TIGR4) GN =fabH PE = 3 SV = 1MAFAKISQVAHYVPEQVVINHDLAQIMDINDEWISSRTGIRQRHISRTESTSDLATEVAKKLMAKAGITGEELDFIILATITPDSMMPSTAARVQANIGANKAFAFDLTAACSGFVFALSTAEKFIASGRFQKGLVIGSETLSKAVDWSDRSTAVLFGDGAGGVLLEASEQEHFLAESLNSDGSRSECLTYGHSGLHSPFSDQESADSFLKMDGRIVFDFAIRDVAKSIKQTIDESPIEVIDLDYLLLHQANDRILDKMARKIGVDRAKLPANMMEYGNISAASIPILLSECVEQGLIPLDGSQTVLLSGFGGGLIWGILILTI (SEQ ID NO: 31)fadY family members for initiation of straight chain fatty acid synthesisPA5174 >tr|Q9HU15|Q9HU15_PSEAE Probable beta-ketoacyl synthase OS =Pseudomonasaeruginosa (strain ATCC 15692/PAO1/1C/PRS 101/LMG 12228) GN =PA5174 PE = 3 SV = 
1MSRLPVIVGEGGYNAAGRSSEHHGERRMVIESMDPQARQETLAGLAVMMKLVKAEGGRYLAEDGTPLSPEDIERRYAERIFASTLVRRIEPQYLDPDAVHWHKVLELSPAEGQALTFKASPKQLPEPLPANWSIAPAEDGEVLVSIHERCEFKVDSYRALTVKSAGQLPTGFEPGELYNSRFHPRGLQMSVVAATDAIRSTGIDWKTIVDNVQPDEIAVFSGSIMSQLDDNGFGGLMQSRLKGHRVSAKQLPLGENSMPTDFINAYVLGSVGMTGSITGACATFLYNLQKGIDVITSGQARVVIVGNSEAPILPECIEGYSAMGALATEEGLRLIEGRDDVDFRRASRPFGENCGFTLAESSQYVVLMDDELALRLGADIHGAVTDVFINADGFKKSISAPGPGNYLIVAKAVASAVQIVGLDTVRHASFVHAHGSSTPANRVIESEILDRVASAFGIDGWPVTAVKAYVGHSLATASADQLISALGTFKYGILPGIKTIDKVADDVHQQRLSISNRDMRQDKPLEVCFINSKGFGGNNASGVVLSPRIAEKMLRKRHGQAAFAAYVEKREQTRAAARAYDQRALQGDLEIIYNFGQDLIDEHAIEVSAEQVTVPGFSQPLVYKKDARFSDMLD (SEQ ID NO: 32)Pmen_0396 >pmy:Pmen_0396 pyrC; dihydroorotase (EC:3.5.2.3); K01465 d1hydroorotase[EC:3.5.2.3] (A)MRTAILGARVIDPASGLDQVTDLYIDGTKLVAFGQAPAGFTADKTLNAQGLIAAPGLVDLSVALREPGYSRKGSIATETLAAAAGGVTSLCCPPLTKPVLDTPAVAELILDRAREAGHTKVFPIGALSKGLAGEQLAELVALRDAGCVAFGNGLDNFRSARTLRRALEYAATFDLQVIFHSQDFDLAEGGLAHEGPTASFLGLAGIPETAETVALARDLLLVEQSGVRAHFSQITSARGAELIANAQARGLPVTADVALYQLILTDEALIDFSSLYHVQPPLRSRADRDGLREAVKAGVISAIASHHQPHERDAKLAPFAATEPGISSVQLQLPLAMSLVQDGLLDLPTLLARLSSGPAAALRLPAGTLSVGGAADIVLFDAQASTVAGEQWYSKGSNCPFIGHCLPGAVRYTLVDGHISYQS (SEQ ID NO: 33) MDS 0454 >pmk:MDS_0454 beta-ketoacyl synthase (A)MSRLPVIVGFGGYNAAGRSSFHHGFRRTVQESLEPQARQETLAGLAQMMKLVRVVDGQYQDQDGQPLSLADIESRYAKQILAGTLVRRIEKQHLDPDAAHWQKSIGVTPADGTSLSFLTQRKQLPEPLPANWSIEELEGNEVRVTLHDSCEFKVDSYRPLAVKSAGQLPTGFEPSELYNARFHPRGLAMTVVGVTDALRSVGIDWQRIVQHVAPDEIAVFASCIMSQLDENGFGGMMQSRLKGGRVTAKQLALGLNTMPADFINAYVLGSVGTTGSITGACATFLYNLQKGIEQIASGKARVVIVGSSEAPINQECIEGYGAMGALATEEGLRQIEGKSEVDFRRASRPFGDNCGFTLAEACQFVVLMDDELALELGADIHGAVPDVFINADGFKKSISAPGPGNYLTVAKAVASAVQLLGLDAVRNRSFVHAHGSSTPANRVTESEILDRVAAAFGIEQWPVTAVKAFVGHSLATASGDQVIGALGAFKYGIVPGIKTIDAVAGDVHQHHLSLSTEDRKVGDQALDVAFINSKGFGGNNASALVLAPHVTERMLRKRHGQAAFDAYLARREGTRAAAAAYDQQALQGKLDIIYNFGNDMIDDQAISITTEEVKVPGFDQPLVFRKDARYSDMLD (SEQ ID NO: 34)Psefu_4068 >tr|F6AJT1|F6AJT1_PSEF1 Beta-ketoacyl synthase OS =Pseudomonas fulva (strain 12-X) GN = Psefu_4068 
PE = 3 SV = 1MKSRLPVIVGFGGYNAAGRSSFHHGFRRIVIESLDEQARQETLIGLAVMTKLVRVVDGRYQSQDGEALSPADIERRYGAQILASTLVRRIEKQHLDPDAAHWHKSIAVGGEAGSLIFVSSRKQLPEPLPANWTVEELGGNDVRVTLHDSCEFKVDSYRALPVKSAGQLPTGFEPGELYNSRFHPRGLQMAVVGVIDALRATGVPWQTIVDHVAPDEIAVFAGSIMSQLDENGFGGLMQSRLKGHRVSSKQLALGLNIMPADFINAYVLGSVGITGSVTGACATFLYNLQKGIEQINAGKARVVIVGNSEAPINAECIEGYGAMGALATEDGLRLIEGKDDVDFRRASRPFGENCGFILSEACQFVVLMDDELALQLGADIHGAATDVFINADGFKKSISAPGPGNYLIVAKAVAAATQLVGIDAVRRRSFVHAHGSSTPANRVIESELLDRVAAAFAIDSWPVAAVKAFVGHSLATASGDQVISALGTFKYGIIPGIKTIDEVAADVHQQHLSISNVDRHDQRMDVCFINSKGFGGNNASAVVLAPHVVERMLRKRHGEAAFSAYQQRREQTRANAQAYDEQATKGQLEIIYNFGNDLIDDTEIAIDDAQIKVPGFAQPLLYKQDDRYSDMLD (SEQ ID NO: 35)Avin_05510 >avn:Avin_05510 beta-ketoacyl synthase (A)MSRLPVIVGFGGYNSAGRSSFHHGFRRTVIESLTPQARQETLAGLAVMMKLVSVVDGQYRDSDGSTLTPAEIERRHGERILAATLIRRIERQYFDVDATHWHKSLTLSGEDQPLHFTTSAKQLPEPLPANWSVEPLEEHQVRVTIHGSCEFKVDSYREMPVKSAGQLPTGFEPGELYNSRFHPRGLQLSVVAATDALRSTGIDWQTILDHVQPDEVAVFSGSIMSQLDENGYGGLLQSRLKGHRVSSKQLPLGFNSMPTDFINAYVLGSVGSTGSITGACATFLYNLQKGIDVITSGQARVVVAGNAEAPITPEIVEGYAAMGALATEEGLRHIEGRDQVDFRRASRPFGANCGFTLAEAAQYVVLMDDSLALELGADIHGAVPDVFVNADGFKKSISAPGPGNYLTVAKAVASAMQLVGEDGVRQRSFIHAHGSSTPANRVTESELLDRVAGAFGIADWPVAAVKAYVGHSLATASGDQLISALGTFKYGLLPGIKTVDRFADDVHDQHLRLSMRDVRRDDLDVCFINSKGFGGNNATGVLLSPRVTEKMLRKRHGEAAFADYRSRREATREAARRYDEQVLQGRFDILYNFGQDMIDEHAIEVNEEGVKVPGFKQAIRFRKDERFGDMLD (SEQ ID NO: 36)PSPA7_5914 >pap:PSPA7_5914 putative beta-ketoacyl synthase 
(A)MSRLPVIVGFGGYNAAGRSSFHHGFRRMVIESMDPQARQETLAGLAVMMKLVKAEGGRYLAEDGTPLSPEDIERRYAERIFASTLVRRIEPQYLDPDAVHWHKVLEATPAEGQALTFKASPKQLPEPLPGNWSVTPAADGEVLVSIHERCEFKVDSYRPLTVKSAGQLPTGFEPGELYNSRFHPRGLQMSVVAATDAIRSTGIDWQTIVDNVQPDEIAVFSGSIMSQLDDNGFGGLMQSRLKGHRVSAKQLPLGFNSMPTDFINAYVLGSVGMTGSITGACATFLYNLQKGIDVITSGQARVVIVGNSEAPILPECIEGYSAMGALATEEGLRLIEGRDEVDFRRASRPFGENCGFTLAESSQYVVLMDDELALRLGADIHGAVTDVFINADGFKKSISAPGPGNYLTVAKAVASAVQIVGLDTVRHASFVHAHGSSTPANRVTESEILDRVASAFGIDGWPVTAVKAYVGHSLATASADQLISALGTFKYGILPGIKTIDKVADDVHQQRLSISNRDVRQDKPLEVCFINSKGFGGNNASGVVLSPRIAEKMLRRRHGEAAFAAYVEKREQTRGAARAYDQRALQGDLEIIYNFGQDLIDEQAIEVSAEQVTVPGFSQPLVYKKDARFSDMLD (SEQ ID NO: 37)PLES_55661 >pag:PLES_55661 putative beta-ketoacyl synthase (A)MYRLPVIVGFGGYNAAGRSSFHHGFRRMVIESMDPQARQETLAGLAVMMKLVKAEGGRYLAEDGTPLSPEDIERRYAERIFASTLVRRIEPQYLDPDAVHWHKVLELSPAEGQALTFKASPKQLPEPLPANWTIAPAEDGEVLVSIHERCEFKVDSYRALTVKSAGQLPTGFEPGELYNSRFHPRGLQMSVVAATDAIRSTGIDWKTIVDNVQPDEIAVFSGSIMSQLDDNGFGGLMQSRLKGHRVSAKQLPLGFNSMPTDFINAYVLGSVGMTGSITGACATFLYNLQKGIDVITSGQARVVIVGNSEAPILPECIEGYSAMGALATEEGLRLIEGRDDVDFRRASRPFGENCGFTLAESSQYVVLMDDELALRLGADIHGAVTDVFINADGFKKSISAPGPGNYLTVAKAVASAVQIVGLDTVRHASFVHAHGSSTPANRVTESEILDRVASAFGIDGWPVTAVKAYVGHSLATASADQLISALGTFKYGILPGIKTIDKVADDVHQQRLSISNRDMRQDKPLEVCFINSKGFGGNNASGVVLSPRIAEKMLRKRHGQAAFAAYVEKREQTRAAARAYDQRALQGDLEITYNFGQDLIDEHAIEVSAEQVTVPGFSQPLVYKKDARFSDMLD (SEQ ID NO: 38)PA14_68360 >tr|Q02EJ1|Q02EJ1_PSEAB Putative beta-ketoacyl synthase OS =Pseudomonas aeruginosa (strain UCBPP-PA14) GN = PA14_68360 PE = 3 SV = 
1MSRLPVIVGEGGYNAAGRSSEHHGERRMVIESMDPQARQETLAGLAVMMKLVKAEGGRYLAEDGTPLSPEDIERRYAERIFASTLVRRIEPRYLDPDAVHWHKVLELSPAEGQALTFKASPKQLPEPLPANWSIAPAEDGEVLVSIHERCEFKVDSYRALTVKSAGQLPTGFEPGELYNSRFHPRGLQMSVVAATDAIRSTGIDWKTIVDNVQPDEIAVFSGSIMSQLDDNGFGGLMQSRLKGHRVSAKQLPLGENSMPTDFINAYVLGSVGMTGSITGACATFLYNLQKGIDVITSGQARVVIVGNSEAPILPECIEGYSAMGALATEEGLRLIEGRDDVDFRRASRPFGENCGFTLAESSQYVVLMDDELALRLGADIHGAVTDVFINADGFKKSISAPGPGNYLTVAKAVASAVQIVGLDTVRHASFVHAHGSSTPANRVTESEILDRVASAFGIDGWPVTAVKAYVGHSLATASADQLISALGTFKYGILPGIKTIDKVADDVHQQRLSISNRDMRQDKPLEVCFINSKGFGGNNASGVVLSPRIAEKMLRKRHGQAAFAAYVEKREQTRAAARAYDQRALRGDLEITYNFGQDLIDEHAIEVSAEQVTVPGFSQPLVYKKDARFSDMLD (SEQ ID NO: 39) fabHA promoterACGCCTCCTTTCCATATACCATACTCTATGAGTAAGATGAACTGATAGTTTAGACGAATATATTGCCATGTGAAAAAAAATAGGATAGAATTAGTACCTGATACTAATAATTGATCACAACCTGATTGATCTTCTAAATTTAAGATATAAAGGAGTCTTCCCTA (SEQ ID NO: 40)Proteins that prefer to initiation fatty acid synthesis using short straight chain startersfabHA >gnl|BSUB|BSU11330-MONOMER beta-ketoacyl-acyl carrier protein synthase III1208222..1209160 Bacillus subtilis subtilis 168MKAGILGVGR YIPEKVLTNH DLEKMVETSD EWIRTRTGIE ERRIAADDVF SSHMAVAAAKNALEQAEVAA EDLDMILVAT VTPDQSFPTV SCMIQEQLGA KKACAMDISA ACAGFMYGVVTGKQFIESGT YKHVLVVGVE KLSSITDWED RNTAVLFGDG AGAAVVGPVS DDRGILSFELGADGTGGQHL YLNEKRHTIM NGREVFKFAV RQMGESCVNV IEKAGLSKED VDFLIPHQANIRIMEAARER LELPVEKMSK TVHKYGNTSA ASIPISLVEE LEAGKIKDGD VVVMVGFGGGLTWGAIAIRW GR (SEQ ID NO: 41)fabHB >gnl|BSUB|BSU10170-MONOMER beta-ketoacyl-acyl carrier protein synthase III(complement(1093747..1092770)) Bacillus subtilis subtilis 168MSKAKITAIG TYAPSRRLTN ADLEKIVDTS DEWIVQRTGM RERRIADEHQ FTSDLCIEAVKNLKSRYKGT LDDVDMILVA TTTSDYAFPS TACRVQEYFG WESTGALDIN ATCAGLTYGLHLANGLITSG LHQKILVIAG ETLSKVTDYT DRTTCVLFGD AAGALLVERD EETPGFLASVQGTSGNGGDI LYRAGLRNEI NGVQLVGSGK MVQNGREVYK WAARTVPGEF ERLLHKAGLSSDDLDWFVPH SANLRMIESI CEKTPFPIEK TLTSVEHYGN TSSVSIVLAL DLAVKAGKLKKDQIVLLFGF GGGLTYTGLL IKWGM (SEQ ID NO: 42) Desaturase enzymesEF617339 >gi|148791377|gb|ABR12480.1|D9-fatty acid 
desaturase [Psychrobacter urativorans]MIAKTAMGLPLKGLRLAIKSSDILIQTAGTQALRLKTWYEEGKANEAASEQPTATSNVNELSPANDDTSINTKTSASTSDNNKTLSTEKPIDIRELEFKKAPINWIPATILITTPIAAAVITPWYLFTHQVSAPVWGVFGAFMVWTGISITAGYHRLLAHRAYKAHPIVKNFLLLGSTLAVQGSAFDWVSGHRSHHRHVDDRMDDPYSAKRGFFFSHIGWMLKNYPSGKFDYKNIPDLTKDRTLQIQHKYYGLWVLAANVGLVAAIGWLIGDVWGTLVLAGLLRLVLTHHFTFFINSLCHMFGSRPYTDTNTARDNFFLALFTWGEGYHNYHHFFQYDYRNGVKWWQYDPTKWLIAGLSKVGLTTELRTIDDTTIKHAEVQMQFKKAQQQIDTVNAGGLDIPHAMKTFQDRIKFEFEAFTQTVEEWQALKAKAIEMKKTEFADRLHEVDDKLKHEYANIEQKIHEHNDNLKVAFRSIGHNSKAA (SEQ ID NO:43) AB015611 >tr|O94747|O94747_MORAP Delta-9 fatty acid desaturase OS =Mortierella alpina PE = 2 SV = 1MATPLPPSFVVPATQTETRRDPLQHEELPPLFPEKITIYNIWRYLDYKHVVGLGLTPLIALYGLLTTEIQTKTLIWSITYYYATGLGITAGYHRLWAHRAYNAGPAMSFVLALLGAGAVEGSIKWWSRGHRAHHRWTDTEKDPYSAHRGLFFSHIGWMLIKRPGWKIGHADVDDLNKSKLVQWQHKNYLPLVLIMGVVFPTLVAGLGWGDWRGGYFYAAILRLVFVHHATFCVNSLAHWLGDGPFDDRHSPRDHFITAFVTLGEGYHNFHHQFPQDYRNAIRFYQYDPTKWVIALCAFFGLASHLKTFPENEVRKGQLQMIEKRVLEKKTKLQWGTPIADLPILSFEDYQHACKNDNKKWILLEGVVYDVADFMSEHPGGEKYIKMGVGKDMTAAFNGGMYDHSNAARNLLSLMRVAVVEYGGEVEAQKKNPSMPIYGTDHAKAE (SEQ ID NO: 44)AF037430 >sp|034653|DES_BACSU Fatty acid desaturase OS =Bacillus subtilis (strain 168) GN = des PE = 2 SV = 1MTEQTIAHKQKQLTKQVAAFAQPETKNSLIQLLNTFIPFFGLWFLAYLSLDVSYLLTLALTVIAAGFLTRIFIIFHDCCHQSFFKQKRYNHILGFLTGVLTLFPYLQWQHSHSIHHATSSNLDKRGTGDIWMLTVNEYKAASRRTKLAYRLYRNPFIMFILGPIYVFLITNRFNKKGARRKERVNTYLTNLAIVALAAACCLIFGWQSFLLVQGPIFLISGSIGVWLFYVQHTFEDSYFEADENWSYVQAAVEGSSFYKLPKLLQWLTGNIGYHHVHHLSPKVPNYKLEVAHEHHEPLKNVPTITLKTSLQSLAFRLWDEDNKQFVSFRAIKHIPVSLPPDSPEKQKLRKNA (SEQ ID NO: 45)Regulatory factorsDesK >gnl|BSUB|BSU19190-MONOMER DesK two-component sensory histidine kinase2090574..2091686 Bacillus subtilis subtilis 168MIKNHFTFQK LNGITPYIWT IFFILPFYFI WKSSSTFVII VGIILTLLFF SVYRFAFVSKGWTIYLWGFL LIGISTASIT LFSYIYFAFF IAYFIGNIKE RVPFHILYYV HLISAAVAANFSLVLKKEFF LTQIPFVVIT LISAILLPFS IKSRKERERL EEKLEDANER IAELVKLEERQRIARDLHDT LGQKLSLIGL KSDLARKLIY KDPEQAAREL KSVQQTARTS LNEVRKIVSSMKGIRLKDEL INIKQILEAA 
DIMFIYEEEK WPENISLLNE NILSMCLKEA VTNVVKHSQAKTCRVDIQQL WKEVVITVSD DGTFKGEENS FSKGHGLLGM RERLEFANGS LHIDTENGTKLTMAIPNNSK (SEQ ID NO: 46) Peptide synthetase modules srfAA module 1(condensation domain, adenylation domain, thiolation domain, it is glutamatespecific) MEITFYPLTDAQKRIWYTEKFYPHTSISNLAGIGKLVSADAIDYVLVEQAIQEFIRRNDAMRLRLRLDENGEPVQYISEYRPVDIKHTDTTEDPNAIEFISQWSREETKKPLPLYDCDLFRFSLFTIKENEVWFYANVHHVISDGISMNILGNAIMHIYLELASGSETKEGISHSFIDHVLSEQEYAQSKRFEKDKAFWNKQFESVPELVSLKRNASAGGSLDAERFSKDVPEALHQQILSFCEANKVSVLSVFQSLLAAYLYRVSGQNDVVTGTFMGNRTNAKEKQMLGMFVSTVPLRTNIDGGQAFSEFVKDRMKDLMKTLRHQKYPYNLLINDLRETKSSLTKLFTVSLEYQVMQWQKEEDLAFLTEPIFSGSGLNDVSIHVKDRWDTGKLTIDFDYRTDLFSREEINMICERMITMLENALTHPEHTIDELTLISDAEKEKLLARAGGKSVSYRKDMTIPELFQEKAELLSDHPAVVFEDRTLSYRTLHEQSARIANVLKQKGVGPDSPVAVLIERSERMITAIMGILKAGGAYVPIDPGFPAERIQYILEDCGADFILTESKVAAPEADAELIDLDQATEEGAEESLNADVNARNLAYITYTSGTTGRPKGVMIEHRQVHHLVESLQQTIYQSGSQTLRMALLAPFHFDASVKQIFASLLLGQTLYIVPKKTVTNGAALTAYYRKNSIEATDGTPAHLQMLAAAGDFEGLKLKHMLIGGEGLSSVVADKLLKLFKEAGTAPRLTNVYGPTETCVDASVHPVIPENAVQSAYVPIGKALGNNRLYILDQKGRLQPEGVAGELYIAGDGVGRGYLHLPELTEEKFLQDPFVPGDRMYRTGDVVRWLPDGTIEYLGREDDQVKVRGYRIELGETEAVIQQAPDVAKAVVLARPDEQGNLEVCAYVVQKPGSEFAPAGLREHAARQLPDYMVPAYFTEVTEIPLTPSGKVDRRKLFALEVKAVSGTAYTAPRNETEKAIAAIWQDVLNVEKAGIFDNFFETGGHSLKAMTLLTKIHKETGIEIPLQFLFEHPTITALAEEADHRESKAFAVIEPAEKQEHYPL (SEQ ID NO: 47)dptA1 module 1 of daptomycin 
synthetaseMDMQSQRLGVTAAQQSVWLAGQLADDHRLYHCAAYLSLTGSIDPRTLGTAVRRTLDETEALRTRFVPQDGELLQILEPGAGQLLLEADFSGDPDPERAAHDWMHAALAAPVRLDRAGTATHALLTLGPSRHLLYFGYHHIALDGYGALLHLRRLAHVYTALSNGDDPGPCPFGPLAGVLTEEAAYRDSDNHRRDGEFWTRSLAGADEAPGLSEREAGALAVPLRRTVELSGERTEKLAASAAATGARWSSLLVAATAAFVRRHAAADDTVIGLPVTARLTGPALRTPCMLANDVPLRLDARLDAPFAALLADTTRAVGTLARHQRFRGEELHRNLGGVGRTAGLARVTVNVLAYVDNIRFGDCRAVVHELSSGPVRDFHINSYGTPGTPDGVQLVFSGNPALYTATDLADHQERFLRFLDAVTADPDLPTGRHRLLSPGTRARLLDDSRGTERPVPRATLPELFAEQARRTPDAPAVQHDGTVLTYRDLHRSVERAAGRLAGLGLRTEDVVALALPKSAESVAILLGIQRAGAAYVPLDPTHPAERLARVLDDTRPRYLVTTGHIDGLSHPTPQLAAADLLREGGPEPAPGRPAPGNAAYIIQTSGSTGRPKGVVVTHEGLATLAADQIRRYRTGPDARVLQFISPGFDVFVSELSMTLLSGGCLVIPPDGLTGRHLADFLAAEAVTTTSLTPGALATMPATDLPHLRTLIVGGEVCPPEIFDQWGRGRDIVNAYGPTETTVEATAWHRDGATHGPVPLGRPTLNRRGYVLDPALEPVPDGTTGELYLAGEGLARGYVAAPGPTAERFVADPFGPPGSRMYRTGDLVRRRSGGMLEFVGRADGQVKLRGFRIELGEVQAALTALPGVRQAGVLIREDRPGDPRLVGYIVPAPGAEPDAGELRAALARTLPPHMVPWALVPLPALPLTSNGKLDRAALPVPAARAGGSGQRPVTPQEKTLCALFADVLGVTEVATDDVFFELGGHSLNGTRLLARIRTEFGTDLTLRDLFAFPTVAGLLPLLDDNGRQHTTPPLPPRPERLPLS (SEQ ID NO: 48)dptA1 module 5                                                  
IDRRPERLPLSFAQRRLWFLSKLEGPSATYNIPVAVRLTGALDVPALRAALGDVTARHESLRTVFPDDGGEPRQLVLPHAEPPFLTHEVTVGEVAEQAASATGYAFDITSDTPLRATLLRVSPEEHVLVVVIHHIAGDGWSMGPLVRDLVTAYRARTRGDAPEYTPLPVQYADYALWQHAVAGDEDAPDGRTARRLGYWREMLAGLPEEHTLPADRPRPVRSSHRGGRVRFELPAGVHRSLLAVARDRRATLFMVVQAALAGLLSRLGAGDDIPIGTPVAGRGDEALDDVVGFFVNTLVLRTNLAGDPSFADLVDRVRTADLDAFAHQDVPFERLVEALAPRRSLARHPLFQIWYTLTNADQDITGQALNALPGLTGDEYPLGASAAKFDLSFTFTEHRTPDGDAAGLSVLLDYSSDLYDHGTAAALGHRLTGFFAALAADPTAPLGTVPLLTDDERDRILGDWGSGTHTPLPPRSVAEQIVRRAALDPDAVAVITAEEELSYRELERLSGETARLLADRGIGRESLVAVALPRTAGLVTTLLGVLRTGAAYLPLDTGYPAERLAHVLSDARPDLVLTHAGLAGRLPAGLAPTVLVDEPQPPAAAAPAVPTSPSGDHLAYVIHTSGSTGRPKGVAIAESSLRAFLADAVRRHDLTPHDRLLAVTTVGFDIAGLELFAPLLAGAAIVLADEDAVRDPASITSLCARHHVTVVQATPSWWRAMLDGAPADAAARLEHVRILVGGEPLPADLARVLTATGAAVTNVYGPTEATIWATAAPLTAGDDRTPGIGTPLDNWRVHILDAALGPVPPGVPGEIHTAGSGLARGYLRRPDLTAERFVANPFAPGERMYRTGDLGRFRPDGTLEHLGRVDDQVKVRGFRIELGDVEAALARHPDVGRAAAAVRPDHRGQGRLVAYVVPRPGTRGPDAGELRETVRELLPDYMVPSAQVTLTTLPHTPNGKLDRAALPAPVFGTPAGRAPATREEKILAGLFADILGLPDVGADSGFFDLGGDSVLSIQLVSRARREGLHITVRDVFEHGTVGALAAAALPAPADDADDTVPGTDVLPSISDDEFEEFELELGLEGEEEQW (SEQ ID NO: 49)Module 2 of CmnA (Sequence listing CmaA, A2)                                                       
PSPEPVAEVSRAEQRIWLLSRLGGHPAEYAIPVALRLAGPLDVAKLKNAVDAVVRRHEGLRHVFPEVDGSPTRAVLDPGSITVAEEANRSVREVLAEGVAALDPATGPLARFTLVNQGPQDHVLAIVLHHLIADGWSVDVLLRDIAAHYTGAPTATPGRYADYLALERAEEQDGALGRALEHFVTALDGVPDEVSFPPDHPRPAQRTGRGDVVRHRIDAAPVTALAERLRTTPFAVLLAAVGVLLHRVGGHRDVVVGTAVARRPDAGLDHLVGLCLNTLALRWPVQPHDTLGEVVRAVTDRLADGLQHDAASFDRVVDKLAPARDSGRTPVFQVMALYEEPYETALALPDVTTTDVTVHCGSAQADAAFGFVPREGGIDLTLQFSTDVFTRATASRWARRLATLLAGARADTRVADLPLLPEDESQDLERWSGTTGEAPTTTLHALAHEIAQRHPDRPAIHFGQNSLTYGEFDARSAQLAHELRARGVRAETPVVVCLERSPEALIAVYGVLKAGGAYVPVETSNPDLRIAELIADSGAALVLTQRRLADRLAALGAEVVVVDEPLPRHPTTDPEPLTGPDHLAYVIYTSGSTGRPKGVMVQHGSVLNFLDALDRRFDLTPDDRLLHKSPLAFDVSVREVFWALTRGASVVVAEPGRHADPGHLVDLVERERVTVAHFVPSSLAVFLEGLPGPGRCPTLRHVLTSGETLPVTTARAARDLLGARLRNMYGPTETTVEMTDHDVVDDTVDRLPIGHPFEGAVVRVLDADLRPVPPGSTGELCVGGLPVARGYLGRPALTAERFVPDPLGPAGARLYRTGDLARLLPDGQLDFLGRNDFQVKVRGHRIEPGEVEAVLGALPGVHGALVTAHDDRLIGYAVTDRDGEELRTALAERLPEHLVPSVVLTLDRFPLTGNGKLDRAALPTPTGRHTGDSRPLTATEAALAAIWRDLLDVPEVRADDHFFALGGHSLLAARVAARAGAALGVALPLPTVLRFPRLADLATAVDGTRADREPVRPRPDRRRRAPLSSAQRRLWIEENLRPGTATYTVAEAFRLRGELDEEAFAAAVDDVLRRHDALRAHVESVEDGEPELVVAPEPRTALRVGDLPADRVRDALAAESARVFDPAGPLVATSLHRLAPDEWLFQFTAHHLVVDGWSLDVLWRDLAACYHDRRAGRAPRPRDGLTFTDYTWWERDVRSRDLEPHLAFWRGELAGLRPQPPADAHGPGAVLDFALGAALSDELRATAAGLGVSPFVLGLTAFALALGEDSPGAIGVEVANRASAETADLVGLEVNHVPVRVAPRGTGRAAVAAVDEARRRVLPHEHVPFDLVVDLLGPGRAPTSVAFSHLDVRGHSPRLDGVTATRLTPPHNGTAKFDLLLEVLDTEHGLTGAFEYRPERFTAARVAQVRNHWEAALLTLLADPDLPVDARRPDFA (SEQ ID NO: 50) /gene = ″mycA″ /coded_by =″AF184956.1:3161-15076″ /transl_table = 11 ORIGIN    1mytsqfqtiv dvirnrsnis drgirfiesd kietfvsyrq lfdeaggflg ylghigiqpk   61geivfqcien ksfvvafwac llggmipvpv sigedndhkl kvwriwniln npfllasetv  121ldkmkkfaad hdlqdfhhql ieksdiiqdr iydhpasgye peadelafiq fssgstgdpk  181gvmlthhnli hntcairnal aidlkdtlls wmplthdmgl iachlvpala ginqnlmpte  241lfirrpilwm kkahehkasi lsspnfgyny flkflkdnks ydwdlshiry iangaepilp  301elcdefltrc aafnmkrsai lnvyglaeas vgatfsnige rfvpvylhrd hlnlgerave  361vskedqncas fvevgkpidy cgiricnean egledgfigh 
igikgenvtg gyynnpestn  421raltpdgwvk tgdlgfirkg nlvvtgrekd iifvngknvy phdiervaie ledidlgrva  481acgvydgetr sreivlfavy kksadrfapl vkdikkhlyq rggwsikeil pirklpktts  541gkvkryelae qyesgkfale stkikefleg hstepvqtpi heietallsi fsevmdgkki  601hlndhyfdmg atslqlsgia erieqkfgce ltvadlftyp siadlaaflv enhseikqtd  661takpsrsssk diaiigmsln vpgasnksdf whllengehg ireypaprvk daidylrsik  721sernekqfvr ggyldeidrf dysffglapk takfmdpnqr lflqsawhai edagyagdti  781sgsqlgvyvg yskvgydyer llsanypeel hhyivgnlps vlasriayfl nlkgpavtvd  841tacssslvav hmackalltg dcemalaggi rtsllpmrig ldmessdglt ktfskdsdgt  901gsgegvaavl lkplqaaird gdhiygvikg sainqdgttv gitapspaaq teviemawkd  961agiapetlsf ieahgtgtkl gdpvefnglc kafekvtekk qfcaigsvka nighlfeaag 1021ivgliksalm lnhkkippla hfnkpnplip fhsspfyvnq evmdftpedr plrggissfg 1081fsgtnahvvl eeytpeseya pedgndphlf vlsahteasl yelthqyrqy isddsqsslr 1141sicytastgr ahldyclami vssnqelidk ltsliqgern lpqvhfgykn ikemqpaekd 1201nlskqisdlm qhrpctkder itwlnriael yvqravidwr avysnevvqk tplplypfer 1261nrcwveavye sakerkekge valdinhtkt hiesflktvi snasgirade idsnahfigf 1321gldsimltqv kkaiadefnv dipmerffdt mnniesvvdy laenvpsaas tppqesvtaq 1381eelvisgaqp elehgehmld kiiasqnqli qqtlqaqlds fnllrnnshf vskeseisqd 1441ktslspksvt akknsageak pyipfgrqtl negvnytpqg rqylesfiek yvdktkgskq 1501ytdetrfaha nnrnlssfrs ywkemvypii aersdgsrmw didgneyidi tmgfgvnlfg 1561hhpsfitqtv vdsthsalpp lgpmsnvage vadriractg vervafynsg teavmvalrl 1621araatgrtkv vvfagsyhgt fdgvlgvant kggaepanpl apgipqsfmn dliilhynhp 1681dsldvirnlg nelaavlvep vqsrrpdlqp esflkelrai tqqsgtalim deiitgfrig 1741lggagewfdi qadlvtygki igggqplgiv agkaefmnti dggtwqygdd syptdeakrt 1801fvagtfnthp ltmrmslavl rylqaegetl yerinqktty lvdqlnsyfe qsqvpirmvq 1861fgslfrfvss vdndlffyhl nykgvyvweg rncflstaht sddiayiiqa vgetvkdlrr 1921ggfipegpds pndgghkepe tyelspeqkq lavvsqygnd asaalnqsim lkvkgavqht 1981llkgavrniv krhdalrtvi hvddevqqvg arinveipii dftgypneqr esevqkwlte 2041dakrpfhfhe qkplfrvhvl tskqdehliv ltfhhiiadg wsiavfvgel estyaaivqg 2101splpshevvs 
frqyldwqqa qienghyeeg irywrqylse pipqailtsm sssryphgye 2161gdrytvtldr plskaiksls irmknsvfat ilgafhlflq qltkgaglvi giptagqlhm 2221kqpmlvgncv nmvpvkntas sestladylg hmkenmdqvm rhqdvpmtiv asqlphdqmp 2281dmriifnldr pfrklhfgqm eaeliaypik cisydlflnv tefdqeyvld fdfntsviss 2341eimnkwgtgf vnllkkmveg dsasldslkm fskedqhdll elyadhqlri sstldhkgvr 2401avyeepenet elgiaqiwae llglekvgrs dhflslggns lkatlmlski qqtfnqkvsi 2461gqffshqtvk elanfirgek nvkyppmkpv eqkafyrtsp aqqrvyflhq mepnqvsqnm 2521fgqisiigky dekaliaslq qvmqrheafr tsfhiidgei vggiageldf nvrvhsmdre 2581efeayadgyv kpfrleqapl vraelikvdn eqaellidmh hiisdgysms iltnelfaly 2641hgnplpeipf eykdfaewqn qlligevmeg geeywlegfk gevpilqlpa dgsramewss 2701eggrvtcslq sslirslqem aqqkgttlym vllaaynvll hkytgqediv vgtpvsgrnq 2761pniesmigif iqtmgirtkp gankrftdyl devkrqtlda fenqdypfdw lvekvnvqre 2821ttgkslfntm fvygniefge ihqdgctfry kernpgvsly dlmltiedae kqldihfdfn 2881pngfegetie qiirhytsll dslvkepeks lssvpmlsdi erhqllmgcn dtetpfphnd 2941tvcqwfetqa eqrpddeavi fgnerctygq lnervnglar tlrtkgvqad qfvaiicphr 3001ielivgilav lkaggayvpi dpeypedriq ymlkdseaki vlaqldlhkh ltfdadvvll 3061deessyhedr snleptcgan dlaymiytsg stgnpkgvli ehrglanyie wakevyvnde 3121ktnfplyssi sfdltvtsif tplvtgntii vfdgedksav lstimgdpri diikltpahl 3181hvlkemkiad gttirkmivg genlstrlaq syseqfkgql difneygpte avvgcmiyry 3241dtkrdrrefv pigspaants iyvldasmnl vpvgvpgemy iggagvargy wnrpdltaek 3301fvhnpfapgt imyktgdlak rlidgnliyl grideqvkir ghrielgeve aamhkveavq 3361kavvlareee dglqqlcayy vsnkpitiae ireqlslelp dymvpshyiq leqlpltsng 3421kinrkalpap evsleqiaey vppgnevesk lavlwqemlg ihrvgikhnf fdlggnsira 3481talaarihke ldvnlsvkdi fkfptieqla nmalrmekir yvsipsaqki syypvssaqk 3541rmyllshteg geltynmtga msvegaidle rltaafqkli erhevlrtsf elyegepaqr 3601ihpsieftie gigareeeve dhvldfiksf dlakpplmry glieltpekh vllvdmhhii 3661sdgvsmnilm kdlnqfykgi epdplpiqyk dyavwqqtea grgnikkgea ywlnrfhdei 3721pvldmptdye rpairdyege sfeflipiel kgrlsgmeea tgttlymilm aaytillsky 3781sgqedivvgt pvsgrshmdv esvvgmfvnt lvirnhpagr 
kifedylnev kenmlnayqn 3841gdypleeliq hvhllkdssr nplfdtmfvl qnldqvelnl dslrftpykl hhtvakfdlt 3901lsiqtdqdkh hglfeyskkl fkksrieals kdylhilsvi squsigieh ielsgstaed 3961dnlihsieln f (SEQ ID NO: 51) Psrf-Gly-lgr_m2-F3-TE-pUC19    1TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCG   50   51GAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCG  100  101TCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCTGGCTTAACTATG  150  151CGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATATCGACAAAAATG  200  201TCATGAAAGAATCGTTGTAAGACGCTCTTCGCAAGGGTGTCTTTTTTTGC  250  251CTTTTTTTCGGTTTTTGCGCGGTACACATAGTCATGTAAAGATTGTAAAT  300  301TGCATTCAGCAATAAAAAAAGATTGAACGCAGCAGTTTGGTTTAAAAATT  350  351TTTATTTTTCTGTAAATAATGTTTAGTGGAAATGATTGCGGCATCCCGCA  400  401AAAAATATTGCTGTAAATAAACTGGAATCTTTCGGCATCCCGCATGAAAC  450  451TTTTCACCCATTTTTCGGTGATAAAAACATTTTTTTCATTTAAACTGAAC  500  501GGTAGAAAGATAAAAAATATTGAAAACAATGAATAAATAGCCAAAATTGG  550  551TTTCTTATTAGGGTGGGGTCTTGCGGTCTTTATCCGCTTATGTTAAACGC  600  601CGCAATGCTGACTGACGGCAGCCTGCTTTAATAGCGGCCATCTGTTTTTT  650  651GATTGGAAGCACTGCTTTTTAAGTGTAGTACTTTGGGCTATTTCGGCTGT  700  701TAGTTCATAAGAATTAAAAGCTGATATGGATAAGAAAGAGAAAATGCGTT  750  751GCACATGTTCACTGCTTATAAAGATTAGGGGAGGTATGACAATATGGAAA  800  801TAACTTTTTACCCTTTAACGGATGCACAAAAACGAATTTGGTACACAGAA  850  851AAATTTTATCCTCACACGAGCATTTCAAATCTTGCGGGGATTGGTAAGCT  900  901GGTTTCAGCTGATGCGATTGATTATGTGCTTGTTGAGCAGGCGATTCAAG  950  951AGTTTATTCGCAGAAATGACGCCATGCGCCTTCGGTTGCGGCTAGATGAA 1000 1001AACGGGGAGCCTGTTCAATATATTAGCGAGTATCGGCCTGTTGATATAAA 1050 1051ACATACTGACACTACTGAAGATCCGAATGCGATAGAGTTTATTTCACAAT 1100 1101GGAGCCGGGAGGAAACGAAGAAACCTTTGCCGCTATACGATTGTGATTTG 1150 1151TTCCGTTTTTCCTTGTTCACCATAAAGGAAAATGAAGTGTGGTTTTACGC 1200 1201AAATGTTCATCACGTGATTTCTGATGGTATGTCCATGAATATTGTCGGGA 1250 1251ATGCGATCATGCACATTTATTTAGAATTAGCCAGCGGCTCAGAGACAAAA 1300 1301GAAGGAATCTCGCATTCATTTATCGATCATGTTTTATCTGAACAGGAATA 1350 1351TGCTCAATCGAAGCGGTTTGAAAAGGACAAGGCGTTTTGGAACAAACAAT 1400 1401TTGAATCGGTGCCTGAACTTGTTTCCTTGAAACGGAATGCATCCGCAGGG 1450 
1451GGAAGTTTAGATGCTGAGAGGTTCTCTAAAGATGTGCCTGAAGCGCTTCA 1500 1501TCAGCAGATTCTGTCGTTTTGTGAGGCGAATAAAGTCAGTGTTCTTTCGG 1550 1551TATTTCAATCGCTGCTCGCCGCCTATTTGTACAGGGTCAGCGGCCAGAAT 1600 1601GATGTTGTGACGGGAACATTTATGGGCAACCGGCAAAATGCGAAAGAGAA 1650 1651GCAGATGCTTGGCATGTTTGTTTCTACGGTTCCGCTTCGGACAAACATTG 1700 1701ACGGCGGGCAGGCGTTTTCAGAATTTGTCAAAGACCGGATGAAGGATCTG 1750 1751ATGAAGACACTTCGCCACCAAAAGTATCCGTATAATCTCCTAATCAACGA 1800 1801TTTGCGTGAAACAAAGAGCTCTCTGACCAAGCTGTTCACGGTTTCTCTTG 1850 1851AATATCAAGTGATGCAGTGGCAGAAAGAAGAGGATCTTGCCTTTTTGACT 1900 1901GAGCCGATTTTCAGCGGCAGCGGATTAAATGATGTCTCAATTCATGTAAA 1950 1951GGATCGATGGGATACTGGGAAACTCACCATAGATTTTGATTACCGCACTG 2000 2001ATTTATTTTCACGTGAAGAAATCAACATGATTTGTGAGCGCATGATTACC 2050 2051ATGCTGGAGAACGCGTTAACGCATCCAGAACATACAATTGATGAATTAAC 2100 2101ACTGATTTCTGATGCGGAGAAACGCGATTTGTTTTTGCGGGTGAACGATA 2150 2151CAGCCAAGGCGTATCCGAACAAGCTGATCATGTCGATGCTGGAGGATTGG 2200 2201GCGGCGGCTACCCCTGACAAAACAGCGCTAGTCTTCCGCGAACAACGCGT 2250 2251GACGTATCGCGAGCTGAACGAGCGGGTCAACCAGTTGGCACACACTTTGC 2300 2301GCGAAAAAGGGGTGCAACCTGACGATCTCGTGATGCTGATGGCAGAGCGG 2350 2351TCGGTCGAGATGATGGTGGCGATTTTCGCTGTGTTGAAAGCGGGCGGAGC 2400 2401GTACTTGCCCATCGACCCGCACAGTCCGGCGGAGCGAATCGCCTACATTT 2450 2451TCGCAGACAGCGGAGCCAAGCTGGTGCTGGCACAGTCGCCGTTTGTGGAA 2500 2501AAGGCAAGCATGGCGGAAGTGGTCCTTGATCTGAACAGTGCGAGCAGCTA 2550 2551TGCGGCGGATACGAGCAACCCGCCACTGGTCAACCAGCCAGGCGATCTGG 2600 2601TGTATGTCATGTACACTTCCGGCTCAACGGGAAAACCAAAAGGCGTGATG 2650 2651ATCGAGCACGGAGCGCTGCTCAATGTGCTTCACGGAATGCAGGACGAGTA 2700 2701CCCGCTTTTGCAGGACGATGCCTTCTTGCTCAAGACAACCTACATATTCG 2750 2751ATATTTCAGTCGCGGAAATTTTCGGGTGGGTTCCGGGTCGTGGCAAACTG 2800 2801GTGATTTTGGAACCGGAGGCGGAAAAGAACCCGAAGGCTATTTGGCAGGC 2850 2851GGTAGTCGGAGCGGGAATTACCCACATCAACTTCGTGCCCTCCATGCTGA 2900 2901TCCCGTTTGTCGAGTATTTGGAAGGGCGAACAGAAGCAAATCGCTTGCGG 2950 2951TACATCTTGGCTTGCGGCGAAGCGATGCCGGATGAACTCGTGCCAAAAGT 3000 3001GTACGAAGTATTGCCAGAGGTGAAGCTGGAAAACATCTACGGCCCGACAG 3050 3051AAGCGACGATTTACGCTTCCCGTTACTCGCTCGCGAAAGGCTCGCAGGAA 3100 
3101AGTCCTGTTCCAATCGGAAAGCCGCTGCCCAACTATCGCATGTATATCAT 3150 3151CAATCGGCATGGACAACTGCAACCAATCGGCGTACCAGGAGAGCTATGCA 3200 3201TCGCAGGAGCAAGTCTGGCGAGAGGGTATTTGAACAATCCAGCGCTGACA 3250 3251GAAGAAAAATTCACTCCTCATCCGCTGGAGAAAGGCGAGCGGATTTATCG 3300 3301CACGGGTGATCTCGCCCGTTATCGCGAGGATGGCAACATCGAATACCTCG 3350 3351GACGGATGGACCATCAGGTGAAAATTCGCGGATACCGGATCGAACTGGAC 3400 3401GAAATCCGCAGCAAGCTGATTCAGGAGGAAACGATTCAGGACGCGGTGGT 3450 3451CGTAGCCCGAAACGATCAAAACGGCCAAGCGTACTTGTGCGCCTACCTGC 3500 3501TGTCCGAACAGGAGTGGACAGTCGGTCAACTGCGCGAGTTGCTTCGCCGT 3550 3551GAACTGCCTGAATACATGATTCCGGCCCATTTCGTTTTGCTGAAACAGTT 3600 3601CCCGCTCACAGCCAATGGCAAGCTCGATCGCAAGGCTTTGCCAGAACCGG 3650 3651ACGGCAGTGTGAAAGCGGAAGCGGAATATGCAGCGCCGCGCACGGAACTG 3700 3701GAAGCGACTTTGGCGCACATTTGGGGCGAAGTGCTCGGAATCGAACGGAT 3750 3751CGGGATTCGCGACGATTTCTTTGCGCTCGGAGGGCATTCCTTGAAGGCCA 3800 3801TGACCGCCGTCCCGCATCAACAAGAGCTCGGGATTGATCTTCCAGTGAAG 3850 3851CTTTTGTTTGAAGCGCCGACGATCGCCGGCATTTCAGCGTATTTGAAAAA 3900 3901CGGGGGCTCTGATGGCTTGCAGGATGTAACGATAATGAATCAGGATCAGG 3950 3951AGCAGATCATTTTCGCATTTCCGCCGGTTCTGGGCTATGGCCTTATGTAC 4000 4001CAAAATCTGTCCAGCCGCTTGCCGTCATACAAGCTATGCGCCTTTGATTT 4050 4051TATTGAGGAGGAAGACCGGCTTGACCGCTATGCGGATTTGATCCAGAAGC 4100 4101TGCAGCCGGAAGGGCCTTTAACATTGTTTGGATATTCAGCGGGATGCAGC 4150 4151CTGGCGTTTGAAGCTGCGAAAAAGCTTGAGGAACAAGGCCGTATTGTTCA 4200 4201GCGGATCATCATGGTGGATTCCTATAAAAAACAAGGTGTCAGTGATCTGG 4250 4251ACGGACGCACGGTTGAAAGTGATGTCGAAGCGTTGATGAATGTCAATCGG 4300 4301GACAATGAAGCGCTCAACAGCGAAGCCGTCAAACACGGCCTCAAGCAAAA 4350 4351AACACATGCCTTTTACTCATACTACGTCAACCTGATCAGCACAGGCCAGG 4400 4401TGAAAGCAGATATTGATCTGTTGACTTCCGGCGCTGATTTTGACATGCCG 4450 4451GAATGGCTTGCATCATGGGAAGAAGCTACAACAGGTGTTTACCGTGTGAA 4500 4501AAGAGGCTTCGGAACACACGCAGAAATGCTGCAGGGCGAAACGCTAGATA 4550 4551GGAATGCGGAGATTTTGCTCGAATTTCTTAATACACAAACCGTAACGGTT 4600 4601TCATAAAGGCATGCAAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGT 4650 4651GTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCA 4700 4701TAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATT 4750 
4751GCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCT 4800 4801GCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGC 4850 4851GCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTG 4900 4901CGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGA 4950 4951ATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGG 5000 5001CCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGC 5050 5051CCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAA 5100 5101CCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCG 5150 5151TGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTT 5200 5201CTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCT 5250 5251CAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCC 5300 5301CCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCC 5350 5351AACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAG 5400 5401GATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGT 5450 5451GGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTG 5500 5501CTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAA 5550 5551ACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTA 5600 5601CGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGG 5650 5651TCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAG 5700 5701ATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTT 5750 5751TTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAA 5800 5801TGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATC 5850 5851CATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCT 5900 5901TACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCG 5950 5951GCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAG 6000 6001AAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCC 6050 6051GGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTT 6100 6101GCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTC 6150 6151ATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGT 6200 6201TGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGT 6250 6251AAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTC 6300 6301TCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACT 6350 6351CAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGC 6400 
6401CCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGT 6450 6451GCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTAC 6500 6501CGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCT 6550 6551TCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAG 6600 6601GCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATAC 6650 6651TCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGT 6700 6701CTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGG 6750 6751GGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCTAAGAAACCA 6800 6801TTATTATCATGACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTT 6850 6851 CGTC6854 (SEQ ID NO: 52) CoA Ligases GenBank: AAX31555.1acyl-CoA ligase [Streptomyces roseosporus NRRL 11379]GenPept Graphics >gi|60650930|gb|AAX31555.1|acyl-CoA ligase [Streptomyces roseosporus NRRL 11379]MSESRCAGQGLVGALRTWARTRARETAVVLVRDTGTTDDTASVDYGQLDEWARSIAVTLRQQ LAPGGRALLLLPSGPEFTAAYLGCLYAGLAAVPAPLPGGRHFERRRVAAIAADSGAGVVLIVAGETASVH DWLTETTAPATRVVAVDDRAALGDPAQWDDPGVAPDDVALIQYTSGSTGNPKGVVVTHANLLANARNLAE ACELTAATPMGGWLPMYHDMGLLGTLTPALYLGTTCVLMSSTAFIKRPHLWLRTIDRFGLVWSSAPDFAY DMCLKRVTDEQIAGLDLSRWRWAGNGAEPIRAATVRAFGERFARYGLRPEALTAGYGLAEATLFVSRSQG LHTARVATAALERHEFRLAVPGEAAREIVSCGPVGHFRARIVEPGGHRVLPPGQVGELVLQGAAVCAGYW QAKEETEQTFGLTLDGEDGHWLRTGDLAALHEGNLHITGRCKEALVIRGRNLYPQDIEHELRLQHPELES VGAAFTVPAAPGTPGLMVVHEVRTPVPADDHPALVSALRGTINREFGLDAQGIALVSRGTVLRTTSGKVR RGAMRDLCLRGELNIVHADKGWHAIAGTAGEDIAPTDHAPHPHPA (SEQ ID NO: 53)Acyl Carrier proteins GenBank: AAX31556.1probable acyl carrier protein [Streptomyces roseosporus NRRL 11379]GenPept Graphics >gi|60650931|gb|AAX31556.1|probable acyl carrier protein [Streptomyces roseosporus NRRL 11379]MNPPEAVSTPSEVTAWITGQIAEFVNETPDRIAGDAPLTDHGLDSVSGVALCAQVEDRYGIE VDPELLWSVPTLNEFVQALMPQLADRT (SEQ ID NO: 54) malonyl-CoA transacylase/protein_id = ″AAF08794.1″ /gene = ″fenF″ /note =″malonyl-CoA transacylase″ /codon_start = 1 /transl_table = 11/product = ″FenF″ /db_xref = ″GI:6449054″ /translation 
=″MNNLAFLFPGQGSQFVGMGKSFWNDFVLAKRLFEEASDAISMDVKKLCFDGDMTELTRTMNAQPAILTVSVIAYQVYMQEIGIKPHFLAGHSLGEYSALVCAGVLSFQEAVKLIRQRGILMQNADPEQLGTMAAITQVYIQPLQDLCTEISTEDFPVGVACMNSDQQHVISGHRQAVEFVIKKAERMGANHTYLNVSAPFHSSMMRSASEQFQTALNQYSFRDAEWPIISNVTAIPYNNGHSVREHLQTHMTMPVRWAESMHYLLLHGVTEVIEMGPKNVLVGLLKKITNHIAAYPLGQTSDLHLLSDSAERNENIVNLRKKQLNKMMIQSIIARNYNKDAKTYSNLTTPLFPQIQLLKERVERKEVELSAEELEHSIHLCQLICEAKQLPT WEQLRILK″(SEQ ID NO: 55)

We claim:
 1. A method of making an acyl amino acid composition by contacting an engineered peptide synthetase with an amino acid substrate and an acyl entity substrate for the engineered peptide synthetase, under conditions and for a time sufficient for an acyl amino acid composition to be made.
 1A. The method of claim 1, wherein the engineered peptide synthetase includes an adenylation (A) domain, a thiolation (T) domain, and a condensation (C) domain.
 1A1. The method of claim 1 or claim 1A, wherein the engineered peptide synthetase lacks a thioesterase domain, and/or a reductase domain.
 1A1a. The method of claim 1 or 1A1, wherein the engineered peptide synthetase contains only a single peptide synthetase domain.
 1A2. The method of claim 1 or 1A1a, wherein the engineered peptide synthetase is or comprises a peptide synthetase domain found as a first domain in a peptide synthetase that synthesizes a lipopeptide.
 1B. The method of claim 1 or claim 1A, wherein the acyl amino acid composition includes, as a prominent component, an acyl amino acid whose amino acid moiety is from an amino acid selected from the group consisting of glycine and glutamate and whose acyl moiety is from a fatty acid selected from the group consisting of myristic acid and lauric acid.
 1C. The method of any one of the preceding claims, wherein the step of contacting comprises providing a cell engineered to express at least one engineered peptide synthetase.
 2. A cell engineered to express at least one engineered peptide synthetase that synthesizes an acyl amino acid.
 3. An acyl amino acid composition produced by an engineered peptide synthetase.
 3B. The composition of claim 3, wherein substantially all of the acyl amino acids in the composition contain the same amino acid component.
 3C. The composition of any one of claims 3-3B, wherein acyl amino acids in the composition comprise different acyl moieties.
 4. A method of preparing a product comprising: providing or obtaining an acyl amino acid composition prepared in an engineered microbial cell; enriching the acyl amino acid composition for a particular acyl amino acid; combining the enriched acyl amino acid composition with at least one other component to produce a product.
 5. A method comprising steps of: contacting an engineered peptide synthetase polypeptide that comprises a single peptide synthetase domain and lacks a thioesterase domain, and/or a reductase domain with: an amino acid substrate of the peptide synthetase polypeptide; and an acyl moiety substrate of the peptide synthetase polypeptide, the contacting being performed under conditions and for a time sufficient that the engineered peptide synthetase polypeptide covalently links the acyl moiety from the acyl moiety substrate to the amino acid so that an acyl amino acid is generated.
 5A. The method of claim 5, wherein the engineered peptide synthetase polypeptide is produced by a cell.
 5A1. The method of claim 5A, wherein the cell is a microbial cell.
 5A1a. The method of claim 5A1, wherein the cell is a bacterial cell.
 5A2. The method of any one of claims 1A, 1A1, or 1A1a, wherein the step of contacting comprises contacting the cell with the substrates.
 5B. The method of claim 5, wherein the cell is an engineered cell.
 1C. The method of claim 5B, wherein the cell is engineered in that the peptide synthetase polypeptide is an engineered peptide synthetase polypeptide.